Question: How are the Q&A's tied together in the website database?

bsugar is asking a question about website

by bsugar | April 15, 2019 06:06 | #19064


I've been working with some of the data from the website (in beta) and have noticed something peculiar that I cannot find any pattern to explain. The short story: given that every answer can also be found as a comment (let's call these "anscoms" for now), there are 36 anscoms unattached to any question (25 questions are missing), and 502 anscoms, covering 173 unique ids, that are not also found in the answers.

Put differently, if you look up the nid of any of the 36 "anscoms" (25 unique nids) in the list of questions, you will not find a question, and if you look up the nid of any of the 515 "anscoms" (173 unique nids) in the list of answers, you will not find an answer. Of those 515 "anscoms", 13 are replies.
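
For anyone who wants to reproduce this kind of check, both lookups are anti-joins on `nid`. Here is a minimal sketch in plain Python, assuming each dataset has been loaded as a list of dicts keyed by the column names shown in the examples below. The helper name `unmatched` and the row with nid 99999 are made up for illustration; this is not necessarily how the counts above were produced.

```python
def unmatched(rows, reference_rows, key="nid"):
    """Return the rows whose `key` value never occurs in reference_rows (an anti-join)."""
    known = {ref[key] for ref in reference_rows}
    return [row for row in rows if row[key] not in known]


# Sample rows: nids 13745 and 18930 come from the examples in this post,
# nid 99999 is a made-up orphan added purely for illustration.
questions = [{"nid": 13745}, {"nid": 18930}]
answers = [{"nid": 13745, "aid": 149}, {"nid": 13745, "aid": 251}]
comments = [
    {"nid": 13745, "cid": 22382},
    {"nid": 18930, "cid": 23584},
    {"nid": 99999, "cid": 1},
]

# Comments whose nid has no matching question.
no_question = unmatched(comments, questions)
# Comments whose nid has no matching answer, and the distinct nids involved.
no_answer = unmatched(comments, answers)
unique_nids_without_answer = {row["nid"] for row in no_answer}
```

Counting `len(no_answer)` against `len(unique_nids_without_answer)` gives the "rows vs. unique nids" pair of numbers quoted above.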

A few premises:


  1. A "question" is a "note" with a powertag of question:some_tag. Therefore, a question will appear in both a list of all notes from the website as well as a list of questions.
  2. All "questions" are "notes" but not all "notes" are "questions".
  3. For every one "question" there exists one and only one "note" that is the same as that "question".
  4. This "question" and "note" share the same "nid" (node id).


  1. An "answer" is also a "comment" (I'm not clear on what distinguishes it).
  2. All "answers" are "comments" but not all "comments" are "answers".
  3. Some "comments" are in reply to an "answer".
  4. "Comments" in response to a "question" can be threaded.
  5. All "answers" have an "id" (or elsewhere, "aid") in the list of "answers".
  6. Not all "comments" that are either also "answers" or "comments" in response to a "question" have an "aid".
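
Some of these premises can be tested mechanically. The sketch below is one possible way to do that in plain Python, using rows from Example 1 below reduced to their id columns. The function name `check_premises` is hypothetical, and treating an `aid` of 0 as "no aid" is my assumption based on how the comment rows look.

```python
def check_premises(notes, questions, answers, comments):
    """Return a dict mapping each checkable premise to True or False."""
    note_nids = [n["nid"] for n in notes]
    question_nids = [q["nid"] for q in questions]
    answer_aids = {a["aid"] for a in answers}
    # Assumption: an aid of 0 means "no aid", as in the comment rows in this post.
    comment_aids = {c["aid"] for c in comments if c["aid"]}
    return {
        # Every question nid also appears among the notes.
        "every_question_is_a_note": set(question_nids) <= set(note_nids),
        # Each question nid matches exactly one note.
        "one_note_per_question": all(note_nids.count(nid) == 1 for nid in question_nids),
        # Every aid referenced by a comment exists in the answers list.
        "comment_aids_exist_in_answers": comment_aids <= answer_aids,
        # At least one comment carries no aid.
        "some_comments_lack_an_aid": any(not c["aid"] for c in comments),
    }


# Rows from Example 1, reduced to the id columns.
notes = [{"nid": 13745}]
questions = [{"nid": 13745}]
answers = [{"nid": 13745, "aid": 149}, {"nid": 13745, "aid": 251}]
comments = [
    {"nid": 13745, "cid": 22223, "aid": 149},
    {"nid": 13745, "cid": 22382, "aid": 0},
    {"nid": 13745, "cid": 22481, "aid": 0},
]

result = check_premises(notes, questions, answers, comments)
```

The premise that every "answer" is also a "comment" is not covered here, because the id columns alone do not link an answer row to its comment row.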


Example 1: Is anyone doing any work with fungi? or bioremediation?

Question Data:

|   | csv | nid | uid | title |
|---|---|---|---|---|
| 0 | questions | 13745 | 498969 | Is anyone doing any… |

|   | csv | nid | uid | title |
|---|---|---|---|---|
| 0 | notes | 13745 | 498969 | Is anyone doing any… |

Answer Data:

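|   | csv | nid | uid | aid | accepted | content |
|---|---|---|---|---|---|---|
| 0 | answers | 13745 | 499993 | 149 | False | Hey Mushroomman!.. |
| 1 | answers | 13745 | 237313 | 251 | False | I’m late to the party… |

|   | csv | nid | uid | cid | aid | reply_to | thread | content |
|---|---|---|---|---|---|---|---|---|
| 0 | comments | 13745 | 579767 | 22223 | 149 | 22382 | NaN | Hello Jlmaybach… |
| 1 | comments | 13745 | 499993 | 22382 | 0 | 0 | /01 | Hey Mushroomman!.. |
| 2 | comments | 13745 | 237313 | 22481 | 0 | 0 | /01 | I’m late to the party… |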

Analysis: This is pretty much what I'd expect of that question. As stated above, each question is also a single note, and each answer has a separate ID. I'm inferring that they are answers because they are on the top level of the thread. While I might expect to see a corresponding "aid" on those rows in the "comments", it makes sense that the reply (first row in "comments") is associated with both the "answer" (aid 149) and the "comment" (cid 22382). Good times.
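
To make that association concrete, here is a small sketch in plain Python that follows the reply's `reply_to` to its parent comment and its `aid` to the answer, then checks that both point at the same post. The helper name `resolve_reply` is hypothetical, and matching on `uid` plus `content` is my assumption about how to tell that an answer row and a comment row are the same post.

```python
# Rows from Example 1 (content strings are truncated exactly as in the tables).
answers = [
    {"nid": 13745, "uid": 499993, "aid": 149, "content": "Hey Mushroomman!.."},
    {"nid": 13745, "uid": 237313, "aid": 251, "content": "I’m late to the party…"},
]
comments = [
    {"nid": 13745, "uid": 579767, "cid": 22223, "aid": 149, "reply_to": 22382,
     "content": "Hello Jlmaybach…"},
    {"nid": 13745, "uid": 499993, "cid": 22382, "aid": 0, "reply_to": 0,
     "content": "Hey Mushroomman!.."},
    {"nid": 13745, "uid": 237313, "cid": 22481, "aid": 0, "reply_to": 0,
     "content": "I’m late to the party…"},
]


def resolve_reply(reply, comments, answers):
    """Return (parent comment, answer) referenced by a reply, or None where missing."""
    parent = next((c for c in comments if c["cid"] == reply["reply_to"]), None)
    answer = next((a for a in answers if a["aid"] == reply["aid"]), None)
    return parent, answer


parent, answer = resolve_reply(comments[0], comments, answers)

# Assumption: same author and same text means the answer and comment are one post.
same_post = (
    parent is not None
    and answer is not None
    and (parent["uid"], parent["content"]) == (answer["uid"], answer["content"])
)
```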

Example 2: Have you tried any good DIY microscope dyes or stains?

Question Data:

|   | csv | nid | uid | title |
|---|---|---|---|---|
| 0 | questions | 18930 | 579821 | Have you tried any good… |

|   | csv | nid | uid | title |
|---|---|---|---|---|
| 0 | notes | 18930 | 579821 | Have you tried any good… |

Answer Data:

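|   | csv | nid | uid | aid | accepted | content |
|---|---|---|---|---|---|---|
| 0 | answers |  |  |  |  |  |

|   | csv | nid | uid | cid | aid | reply_to | thread | content |
|---|---|---|---|---|---|---|---|---|
| 0 | comments | 18930 | 1 | 23584 | 0 | 0 | 01/ | We’ve used watercolors |
| 1 | comments | 18930 | 579821 | 23589 | 0 | 23584 | 02/ | Do you have any photos |
| 2 | comments | 18930 | 1 | 23593 | 0 | 0 | 03/ | Yes, the purple ones in |
| 3 | comments | 18930 | 237313 | 23594 | 0 | 0 | 04/ | Can you give a quick pointer |
| 4 | comments | 18930 | 579821 | 23634 | 0 | 23584 | 05/ | Staining is used mostly with |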

Analysis: Here's where it gets confusing. Based on the first example, I would expect, at the very least, comments 0, 2, and 3 to appear in the answers dataset, each with its own "aid". I would also expect comments 1 and 4 to carry the "aid" of comment 0 (cid 23584), were it in the answers table above. Instead, none of the comments appear in the answers table at all, and none of them have an "aid".
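
That expectation can be written down as a rule: a comment with `reply_to` of 0 is top-level and would be an answer, and any other comment is a reply to the comment it points at. A minimal sketch in plain Python, using the Example 2 rows reduced to `cid` and `reply_to`; the function name `expected_roles` is hypothetical, and the rule itself is only my inference from Example 1, not something the database documents.

```python
# Example 2 comment rows, reduced to the columns the rule needs.
comments = [
    {"cid": 23584, "reply_to": 0},
    {"cid": 23589, "reply_to": 23584},
    {"cid": 23593, "reply_to": 0},
    {"cid": 23594, "reply_to": 0},
    {"cid": 23634, "reply_to": 23584},
]


def expected_roles(comments):
    """Map each cid to ("answer", cid) if top-level, else ("reply", parent cid)."""
    roles = {}
    for c in comments:
        if c["reply_to"] == 0:
            roles[c["cid"]] = ("answer", c["cid"])
        else:
            roles[c["cid"]] = ("reply", c["reply_to"])
    return roles


roles = expected_roles(comments)
expected_answers = [cid for cid, (role, _) in roles.items() if role == "answer"]
expected_replies = {cid: parent for cid, (role, parent) in roles.items() if role == "reply"}
```

Under this rule the comments in rows 0, 2, and 3 come out as expected answers and rows 1 and 4 as replies to cid 23584, which is the pattern the actual data does not show.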


Hi Benjamin! We're partway through a project this month to convert all answers into comments. So you're seeing a transitional database state during the switchover! I hope this makes sense!

Oh, I see. Funny, Gaurav (whose PL username I do not know, sorry!) was just talking about this in relation to the way people are using the functionality, which can sort of be seen in the data in terms of how people use the top-level comment thread versus a sub-level thread (responses to a specific comment, like this one). It's interesting in terms of common usage vocabulary. For example, Quora and Stack Overflow have somehow established clarity on when you are submitting an answer and when you are commenting on an answer. The vocabulary of forums is quite different, and even then a place like Reddit seems to have somehow made threads the norm, whereas something like Google Groups, which I believe does have threading capability, has not.

In any case:

I'll try to see if I can find the issue on GitHub because I'd love to see how that started and where it's going.

In the end, it seems I decided to do exactly what you are doing, which is to do away with the distinction between an "answer" and a discussion about an answer and just call everything in the comment thread an answer.
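
In practice that simplification is just a group-by on `nid` over the comments, ignoring `aid` and the answers list entirely. A rough sketch in plain Python, using the `nid` and `cid` values from the two examples in my post; the function name `responses_by_question` is hypothetical.

```python
from collections import defaultdict


def responses_by_question(comments):
    """Group every comment cid under its question nid, ignoring the answer/comment split."""
    grouped = defaultdict(list)
    for c in comments:
        grouped[c["nid"]].append(c["cid"])
    return dict(grouped)


# nid and cid values from Examples 1 and 2.
comments = [
    {"nid": 13745, "cid": 22223},
    {"nid": 13745, "cid": 22382},
    {"nid": 13745, "cid": 22481},
    {"nid": 18930, "cid": 23584},
    {"nid": 18930, "cid": 23589},
    {"nid": 18930, "cid": 23593},
    {"nid": 18930, "cid": 23594},
    {"nid": 18930, "cid": 23634},
]

grouped = responses_by_question(comments)
```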

I'm not sure that'll take care of all of the "orphans" since some don't have references back to a question but I'll wait and see if the complete transfer takes care of that.

Hi Benjamin!

Here's the issue -

> I'm not sure that'll take care of all of the "orphans" since some don't have references back to a question but I'll wait and see if the complete transfer takes care of that.

Although the linked issue has pre-planned goals, we are always open to feedback and suggestions.

