Public Lab Research note



Bot for Publiclab

by ryzokuken

Why

Being an active and welcoming open source community, Publiclab requires a chatbot for a variety of purposes, including the automation of some critical but uninteresting jobs.

Some examples include:

  • Normal chatbot functionality (greeting new users, fetching some important data real quick, etc) (can be expanded to add features as and when required)
    • We could add a cool karma system (freeCodeCamp's bot uses brownie points) that rewards members for helping others out by awarding them points and taking away for unhelpful behavior.
  • Gitter-IRC sync (currently being handled by matrixbot)
  • Automated PR reviews (currently being handled by dangerbot) (let's keep in mind that our solution would be very specific to our needs and highly extensible. Also, as it comes as a 'service' among many handled by the bot, it would fit in perfectly with all our other features)
  • Automated reviews for critical documents on the Publiclab website (would need more information for this)
  • Keeping track of open issues/assigned issues etc. (this would allow a user to say, print out all the current fto issues right in the chatroom, or maybe print out all the issues some other user is assigned to) (all these can be ideally done using Github's interface, but sometimes you need to demonstrate something and this may come in handy in such situations)

How

We would most probably build such a bot (only if we choose to do so) on Node. Not only does Node feel natural for such an application, but it also works quite well when interacting with multiple APIs. We could host the bot on heroku or somewhere similar to start with and perhaps move to our own servers somewhere down the line (if feasible). We could sprinkle in NLP only in places, because (I feel) it is not very important in our context. I have experience using NLTK in Python in conjunction with a chatbot running on Node, but the NodeJS packages are getting better, and we could try a more native solution if we're feeling adventurous.

Case 1: Very Simple Bot

A very simple bot would help guide users on the site and mainly automate generic community interaction.
Working Example: FreeCodeCamp's famous CamperBot would greet anyone who said "Hello, World!" in the chat and welcome them to the community (it was recently brought to my notice that this functionality has been removed from CamperBot). However, what we are seeking to achieve here is not as cosmetic as CamperBot. Don't get me wrong, but just greeting newbies wouldn't really get work done; we need to actually help them get started. I realize it always depends on the person as well as their motivations, but there are certainly a few generic suggestions that apply to everyone, right?

Example Interactions:

  • Person: Hey everyone!
    Bot: (maybe make it check a database to confirm if the person is indeed a new member in order to avoid spamming the chat channel)
    Hey, #{Person}! Welcome to Publiclab. Click (here)[...wiki...] to access our wiki. Here's a list of a few wiki articles to get you started. Feel free to contact @liz or @stevie on this chat if you have any issues.
  • Person: Hello, World!
    Bot: (same old drill)
    Hello, #{Person}! Welcome to Publiclab. Please go through our (Contribution Guidelines)[...guidelines...] and (Code of Conduct)[...] in order to start contributing. If you are new to open source software, take a look at our (first-timers-only)[...] issues. Check out our (README)[...] in order to setup the project locally or if you're unable to do so, feel free to ask @jywarren to grant you free access to Cloud9. If you are still facing any problems, feel free to ask.
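
The "check whether the person is really new" step above could look roughly like the following. This is only a minimal sketch: the in-memory Set stands in for whatever database we end up using, and the names `createGreeter` and `GREETING_PATTERN` are made up for illustration.

```javascript
// Hypothetical sketch: greet a user only the first time they say hello.
// An in-memory Set stands in for the database mentioned above.
const GREETING_PATTERN = /^(hey|hi|hello)\b/i;

function createGreeter(knownUsers = new Set()) {
  return function handleMessage(username, text) {
    if (!GREETING_PATTERN.test(text.trim())) return null; // not a greeting
    if (knownUsers.has(username)) return null;            // already welcomed
    knownUsers.add(username);
    return `Hey, ${username}! Welcome to Publiclab.`;
  };
}

const greet = createGreeter(new Set(['liz']));
console.log(greet('newcomer', 'Hey everyone!')); // first greeting gets a reply
console.log(greet('newcomer', 'Hello, World!')); // null: already welcomed
console.log(greet('liz', 'Hi all'));             // null: existing member
```

Returning null for "stay silent" keeps the spam-avoidance decision in one place, so the chat adapter only has to post non-null replies.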

Case 2: A Slightly Advanced Bot

This slightly advanced bot would do little more than the simpler version, except that it would also provide help on very specific topics. In other words, this is just a version of the simpler bot with many more help cases beyond the utterly generic ones.

Example Interactions:

  • Person: @bot help spectrometry
    Bot: Hey, #{Person}! I see you needed help with spectrometry. We have a (wiki entry for spectrometry)[...], make sure to check it out. Also, you can refer to the (Wikipedia entry for spectrometry)[...] for additional information.
  • Person: @bot help bower
    Bot: Hey, #{Person}! I see you needed help with bower. Take a look into bower's (official documentation)[...]. Also there are Github wiki entries regarding setting up bower on the following of our projects:
    • (plots2)[...]
    • (PublicLab.Editor)[...]
  • Person: @bot help bower installation
    Bot: Hey, #{Person}! Bower's official documentation has an (installation section)[...].
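
A hard-coded help lookup like the one in these interactions might be sketched as below. The table contents are copied from the examples (with the same placeholder links), while `HELP_ENTRIES`, `handleHelp` and the "Sorry" reply are assumptions made for illustration.

```javascript
// Hypothetical help table; the '[...]' links are placeholders, as in the examples above.
const HELP_ENTRIES = new Map([
  ['spectrometry', 'We have a (wiki entry for spectrometry)[...], make sure to check it out.'],
  ['bower', "Take a look into bower's (official documentation)[...]."],
  ['bower installation', "Bower's official documentation has an (installation section)[...]."],
]);

// Parse "@bot help <topic> [subtopic]" and return a reply,
// or null if the message is not a help command.
function handleHelp(username, text) {
  const match = /^@bot\s+help\s+(\S+)(?:\s+(\S+))?$/i.exec(text.trim());
  if (!match) return null;
  const topic = match[1].toLowerCase();
  const specific = match[2] ? `${topic} ${match[2].toLowerCase()}` : topic;
  // Prefer the most specific entry, then fall back to the general topic.
  const entry = HELP_ENTRIES.get(specific) || HELP_ENTRIES.get(topic);
  if (!entry) return `Sorry, ${username}, I have no help entry for "${topic}" yet.`;
  return `Hey, ${username}! ${entry}`;
}

console.log(handleHelp('Person', '@bot help bower installation'));
```

Adding a new help case is then just one more row in the table, which should keep this part approachable for new contributors.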

Case 3: A full-fledged useful bot

This is a bot that builds on top of the other two, but rather than just helping members, it performs real work by interacting with APIs (the Github and plots2 APIs seem sufficient for now). This opens the door to an open-ended set of "features", each of which amounts to calling these APIs.

Example Interactions:

  • Person: Hey, @jywarren.
    Jeff: Welcome to Publiclab, #{Person}. Take a look at our first-timers-only issues in order to begin contributing.
    Jeff: @bot issues first-timers
    Bot: We currently have the following first-timers-only issues:
    • publiclab/plots2#1301
    • publiclab/plots2#1198
    • ...
  • Ujjwal: @bot issues unassigned
    Bot: The following issues are currently open and unassigned:
    • publiclab/plots2#1340
    • publiclab/plots2#1339
    • ...
  • Person: @bot pull-request add-for-review 1338
    Bot: Pull request publiclab/plots2#1338 has been marked as finished, and has been added for reviewal. Our reviewers team will look into it shortly. Thank you for your contribution.
  • Person: @bot pull-request add-for-review 1338
    Bot: Pull request publiclab/plots2#1338 could not be added for reviewal as Danger reported the following problems with your pull request:
    • This pull request doesn’t link to a issue number. Please refer to the issue it fixes (if any) in the format: Fixes #123.
    • You have added multiple commits. It’s helpful to squash them if the individual changes are small.
    • ...
  • Liz: @bot notes awaiting-moderation
    Bot: The following notes are awaiting moderation:
    • <title of a note linking to it... I hope you got the point>
    • ...
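
For the two `issues` commands above, the Github side could be sketched like this. It assumes Github's REST endpoint for listing repository issues with its `labels` and `assignee=none` filters; the label name `first-timers-only`, the helper names and the sample data are illustrative assumptions, and the actual HTTP request is left out so the sketch runs offline.

```javascript
// Build a Github REST API URL for the open issues of a repository.
// `filters` may hold `labels` (comma-separated) or `assignee` ('none' = unassigned).
function buildIssuesUrl(repo, filters = {}) {
  const url = new URL(`https://api.github.com/repos/${repo}/issues`);
  url.searchParams.set('state', 'open');
  for (const [key, value] of Object.entries(filters)) {
    url.searchParams.set(key, value);
  }
  return url.toString();
}

// Turn the API's JSON array into a chat reply like the ones shown above.
// The issues endpoint also returns pull requests, marked by a `pull_request` key.
function formatIssues(repo, heading, issues) {
  const lines = issues
    .filter((issue) => !issue.pull_request)
    .map((issue) => `• ${repo}#${issue.number}`);
  return [heading, ...lines].join('\n');
}

// Hypothetical mapping from chat sub-commands to API filters.
const ISSUE_COMMANDS = {
  'first-timers': { labels: 'first-timers-only' }, // assumed label name
  unassigned: { assignee: 'none' },
};

// Sample data only, shaped like the API response (not fetched from Github).
const sample = [{ number: 1340 }, { number: 1339 }, { number: 1338, pull_request: {} }];
console.log(buildIssuesUrl('publiclab/plots2', ISSUE_COMMANDS.unassigned));
console.log(formatIssues('publiclab/plots2', 'The following issues are currently open and unassigned:', sample));
```

Keeping URL building and reply formatting as pure functions means both can be unit-tested without network access or an API token.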

Now, before you say anything, I realize that this solution just works. In other words, it's almost outright hideous, but it might be the farthest we can reach without running into NLP. Therefore, the next mention has to be...

Case 4: The bot Publiclab deserves but might not need right now

This is just the previous bot with no new features, but with one addition that takes it to a whole new level of coolness (did I forget to mention complexity?): Natural Language Processing and/or Machine Learning.

Example Interactions:

A sample use of NLP
Person: @bot please put up 1338 for reviewal.
OR
Person: @bot, I am done working on 1338.
OR
Person: @bot, 1338 seems good to go.
Bot: Pull request publiclab/plots2#1338 has been marked as finished, and has been added for reviewal. Our reviewers team will look into it shortly. Thank you for your contribution.
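
To make the idea of "several phrasings, one intent" concrete, here is a crude stand-in. It is a keyword heuristic, not real NLP; a proper solution would use an NLP library or a trained model. The hint list, the function name and the returned object shape are all assumptions for illustration.

```javascript
// Crude keyword heuristic standing in for NLP (NOT a real language model).
const REVIEW_HINTS = [/\breview(al)?\b/i, /\bdone\b/i, /\bfinished\b/i, /\bgood to go\b/i];

// Map a free-form message to an "add-for-review" intent, or null if unsure.
function parseReviewRequest(text) {
  if (!/@bot\b/i.test(text)) return null;   // only react when addressed
  const number = /\b(\d+)\b/.exec(text);    // first standalone number = PR number
  if (!number) return null;
  const wantsReview = REVIEW_HINTS.some((hint) => hint.test(text));
  return wantsReview
    ? { intent: 'add-for-review', pullRequest: Number(number[1]) }
    : null;
}

console.log(parseReviewRequest('@bot, 1338 seems good to go.'));
```

Even this small heuristic shows the testing problem: every new phrasing needs a new hint, which is exactly the gap a real NLP approach is meant to close.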

A sample use of ML (and also NLP)
Ujjwal: @bot, Which issues are currently unassigned?
Bot: Ujjwal, the currently unassigned issues, sorted by your preference for the tags are:

  • publiclab/plots2#1340
  • ...

I hope I was able to make my ideas clearer and make the abilities of the different types of bots concrete, hopefully making it easier for everyone to decide which one will suit our needs while still being simple enough to be worked on by as large a group of people as possible (we have worked so hard to make plots2's codebase approachable and simple; it'd be a shame if a lot of people felt alienated by the codebase of the chatbot instead). Please feel free to comment on this post with suggestions and requests for clarification, and add any feedback in http://pad.publiclab.org/p/bot. Thank you.

Architecture (suggested):

  • Hosting: heroku
    We will probably try to make the bot work in all environments (development, production and staging) on a container so that we don't have to make a switch later when we have expectations to fulfill.
  • Scripting Language:
    • Node JS (because of obvious reasons)
    • Python is a solid contender (because it is one of the most popular scripting languages, also it is the only language apart from Node JS to have a Gitter module, but mainly because it has the NLTK [highly situational])
    • Ruby follows closely (because plots2 works on Ruby and I think Ruby has the best support for the Github module [somewhat situational])
  • If you require any other specifications, please let me know.

Flow of Data

The basic flow of data to and from the user and the various other sources of information can be visualized as follows:

[Figure: Flow of Data diagram]

Concept Design (need feedback from users and community, especially @stevie and @liz)

The chatbot's interface on publiclab.org must be minimal (we need to avoid overloading the user with information at all costs) and unobtrusive, so that it does not affect the current look and feel of Publiclab. On chat services, the bot would have to conform to each service's norms.

As we are looking for minimalism, the chat interface would take more cues from the Lounge interface (currently in use for http://chat.publiclab.org/) rather than the original IRC webclient.

The actual chatbot interaction would take place in a frame that floats over the main page of Publiclab (sort of like a popup; the IRC chat also works this way, I suppose) and would be triggered using a dormant-looking FAB. For reference, here's what a Floating Action Button looks like according to Google's Material Design Specification:

[Figure: what a FAB looks like]

This design is heavily opinionated and I would love to listen to your ideas/concerns.


Will keep updating this thread as and when things get clearer. Please feel free to provide feedback and suggestions.




27 Comments


Hi, Ujjwal! I'm particularly interested in the possibility of a bot for publiclab.org; do you think you could sort of mock up a "script" to explore and articulate what this would look like so people know what you're proposing? Just an example like:

NEWCOMER: How do I fix this camera?

BOT: Hi, I'm an automated welcoming system. Can you choose a topic that your question most relates to? (Options)

Or something like this?


@liz @stevie you'll find this interesting!


Hi! Neat idea. Thanks for posting! Exploring this further, I don't know much about bots, but I think it could be really helpful for newcomers. Just because I don't know much about bots:

  • Could we articulate who sees the bot? for example only first time posters?
  • Where might the bot live? On the dashboard or in different places around the site?

For the sake of wondering if this is a good fit, does anyone know:

  • What could be some of the drawbacks? Does anyone have "bad bot" experience? (great posts above @warren)
  • Anyone know of any user studies on the effectiveness of bots? What about in comparison to instructional videos?
  • One concern I have would be making sure it doesn't overwhelm or frustrate users, what are ways we can make sure people only see the bot when they want to?

No worries if you don't know the answers to these, and maybe some of them are flexible based on what would be built. Interested in exploring further.


@warren Definitely had this in mind. We already have quite a few automation mechanisms in place, including the message mirroring. But if there's something a prebuilt library cannot provide, it is automation on publiclab.org itself. This was exactly what I had in mind when I came up with this: a bot that manages "all of publiclab's automation needs". As for the platform, people can catch the chatbot side of our versatile bot in any of our chatrooms. Also, it could be pretty neat if we could embed a mini-chatroom of sorts within the webapp itself. Tell me what you think about it.


@liz @stevie Regarding the clarifications for the bot. I need you to realize that the bot is actually a huge cluster of tiny microservices (let's say, for instance, that we have a microservice for the sole purpose of finding jokes on the internet and serving them to the community) which are abstracted away by a layer of normal chatbot functionality. In a way, the chatbot service works as an I/O service for our bot. People would tell their needs to the chatbot, who would perform multiple tasks using the other services and then serve the response back on the same channel. Ideally, users should use one of our multiple chat mediums to talk to the bot (we'd definitely have it listen to the messages on both IRC and Gitter. And we would possibly add it to Slack if the community over at Slack deem it useful for their work), but we could definitely also add/embed an interface over at Publiclab.org itself, so that we do not need to redirect users who wish to use the bot towards one of the chats.

Regarding examples, there are multiple chatbots working hard as we speak, but I realize that most of them are not used for community management. An important and relevant example would have to be FreeCodeCamp's camperbot. Not only does the bot greet new users and make the community more "welcoming", it also serves hints for the problems, rewards members with brownie points and serves data from FCC's wiki on request, among many other functions. Here's the link to camperbot's source code: https://github.com/FreeCodeCamp/camperbot. By contrast, my very own open source community in college has a Gitter bot of its own. Here's the link to it: https://github.com/osdc/osdc-bot

Although this bot started with purely recreational purposes (it serves jokes and insults still), things quickly got serious as we ended up adding quite a few features in it, including the bot's ability to trigger deployment of any of our projects, including itself. We plan to expand it by making a few neat features like making it tell when the next workshop is, and so on. I hope you got the idea. Will come back with more input soon.


Hi @ryzokuken this is fascinating! Will we need to write specific interactions, like @jywarren started mocking up? I'm very interested in all the directions you laid out, and starting to answer @stevie's questions. One thing I will hesitate on is the "brownie points" concept, as other than signs of community appreciation (barnstars), we've been avoiding points, reputation, rank, and certification models.

What are the next steps towards this idea? Maybe we could use this pad to write it up together? http://pad.publiclab.org/p/bot


Hey @liz! One simple and solid solution could be to make our bot "dumb" by hardcoding certain special interactions. In this case, the bot will realize that "@plotsbot help" means that the person is asking for help and "@plotsbot questions @ebarry" would mean that the speaker (or the type-er :P) is interested in all the questions asked by user @ebarry on publiclab.org.

An alternative would be to use NLP and ML in our bot from Day 1 (although they can always be added later in the model I described above). While certainly a few thousand times more difficult to achieve than the simpler alternative, this would allow the bot not only to understand natural language, but also to provide better responses based on its past experiences. Another problem that arises with ML is that I don't know to what extent that technology can be tested (like... really, how can we predict how the bot will act?)

As for the brownie point system, I respect your stance, and feel that a single model is sufficient for community appreciation.

Should the community and general administration agree with the proposal, a good next step (in my opinion) would be to write a formal design document that states the purpose and important details of the bot and how it will operate. That document would be used by contributors like me as a reference.

Thanks


Thanks for the pad, @liz. I added a little suggestion box for the community in the pad.


@warren Did you guys finally decide anything?


Hi, @ryzokuken -- have you had a chance to try writing some sample scripts, as I suggested above? Also, I think it's worthwhile to write out in more detail each step it'll take to implement the first couple bots, the ones you write the scripts for.

And potentially the steps to get a bot working on the PL website. What infrastructure will you need? A VM? Can you test it on Cloud9? How will a bot post a comment, or find comments? What API calls will you need plots2 to offer for reading recent posts, comments, or posting comments?

Thank you!


Hey @warren! I have indeed been working on the sample scripts, but the kind of request/response model the bot follows would totally depend if we're using ML+NLP or if we're hardcoding conversations. If you could tell me which one we'd proceed with, I could totally put up the scripts for reference (I would need to write them in code anyway).

Regarding the minor details, I guess it'll be more fruitful if we could discuss that in person, because:
A) I would be able to receive instant feedback and actually "discuss" rather than just stating my opinions.
B) This being a collaborative project, the opinions and doubts of everyone need to be addressed, rather than again, me stating my preferences and opinions.

Please feel free to drop me a mail or a comment in here regarding when you would be free to discuss this in the next couple of days :smile:

Regarding the architecture, again, that's something we would need to discuss and that's exactly what I meant when I said that we need to make a design doc. Personally, I believe the best way for us to do this would be to host the bot on heroku for the time being and try to shift it to our own servers later if we have the required resources (which are pretty minimal, tbh) and only if it promises a significant speed boost (which it will, if the majority of the bot's activity happens at publiclab.org rather than on Slack or Gitter or OFTC).


Hi, I'm happy to meet with you perhaps on Monday -- what times are you free? And as to the scripts, I think starting with a hard-coded one is a good place to begin, but I also think that posting one of each would help people understand the pros and cons of an NLP-based approach, by contrast to a hard-coded script. I'm not asking you necessarily to post a complete interaction as it'll be coded, but to illustrate a few initial options to help people understand what your bots could do, by example. Thanks!


On Monday, I would be free from around 4 PM in the evening to around 12 - 2 AM in the midnight. Considering you to be in the Eastern Standard Time timezone, I'd be free from 6:30 AM to 1:30-3:30 PM of your time. Thanks. Will be putting up a few sample scripts over here later today.


@warren Totally forgot that I have this week off as a holiday. Monday is the main festival (BTW, Happy Holi, everyone) but I'll try to stay online as much as I can. Ping me whenever you're free. Could not work more on the scripts today because I had been working on the containerization. Hopefully, we will be able to provide a containerized alternative to the usual development workflow and then automate almost every single bit of it by the end of this week (stay tuned :D)


@warren @liz @stevie Updated the note... Hope you guys find it up to the mark. There is some issue with the markdown being rendered, will look into it. Meanwhile, would you guys prefer me to put it elsewhere? Thanks


Ah this is super helpful, thank you @ryzokuken.


Hi, so on number 1, would that run in our chatroom, or where exactly? Do you have some ideas on one to run on submitted questions, for example? Maybe @stevie has thoughts on the kinds of helpful things a bot might ask someone for (like, if a photo is missing, or tags are missing, or something?)


Hi there, I'm also wondering about where it would live. I think a lot of first time PL users don't see or go to the chat, so I'm wondering if a bot on the dashboard might be more helpful if that's possible. I have a few ideas:

  • I'm wondering if a bot could be used to help first time people who come with a specific interest, and direct them there more quickly. For example,

    • if someone comes with: "I just have a question". The bot could help get them to the Q/A interface and help them write a good question.
    • If someone wants to "share work" maybe the bot could help them find what subject they're interested in and direct them to: answering questions, updating a wiki, posting an activity on that topic.
  • I think a bot might also be useful in helping people learn about what's on the dashboard and help them sort what material they want to see (the tab where you can select questions, research notes, comments and events).

  • One other idea I'm thinking about is helping people who aren't coders post GitHub issues. I've started to see things that might be website/bug issues posted on the Q+A. A bot that could help people distinguish between the two and post accordingly might be really useful.

Also one question I have is how hard would it be to change what the bot does once it's made? My concern is, for example, if we have a bot that helps people navigate the dashboard, what happens if we change something on the dashboard?

Thanks for your work on this. It sounds really neat!!


Thanks for the feedback and appreciation, @warren and @stevie. I've thought over some of these issues, and would definitely love to discuss those. But commenting on this thread might not be our best option. Whenever you're free (if you're free at all) I would love to talk about these and any other issues you guys have in detail.


Hi, @ryzokuken -- I'm happy to meet tomorrow, but if you could try putting in some of your next questions here, it'll be easier for us to all stay in sync even if we can't all get on the same chat; of course, I'm happy to chat, and look forward to it tomorrow, but just thinking beyond the 1:1 discussion and how to get others' viewpoints in here.

Thanks!


Hi, @ryzokuken - saw your draft diagram on the chatroom and it's looking great -- left some comments there. Hope you're having a great weekend!


Updated the note with the flow of data diagram. Hope it helps in visualizing the flow of data to and from the bot. Please feel free to voice your concerns about the diagram and the bot in general.

@warren: Regarding your concerns with the diagram:
1. By "From User", I mean the user interacting with the bot directly. Users would obviously also interact with the bot through Github, but user actions on Github and Publiclab.org would seldom "trigger" some action on the bot's part. At least that's what I have in mind for now, although everything is subject to discussion and change.
2. Which actions the bot would need to perform on each of these interfaces would depend highly on the use cases we put to work. The bot might or might not be required to post research notes, for instance. It would all depend on what the feature we're working on "requires" the bot to do.

Thanks


Ah, great, thank you! As to actions required, I think we might start with some simple ones, like:

  • read listing of recent pull requests
  • read pull requests
  • read pull request comments
  • comment on pull request
  • all the above but for issues
  • read listing of recent research notes
  • read individual research notes
  • read research note comments
  • write research note comments

Not that we'd have to tackle all of these at once, but it'd be good to know how many distinct API URLs we're talking about. And for the Public Lab site, how many we'd have to create -- would this, for example, require making a unique Public Lab user account for the bot, and creating a unique token system so that it could post without needing to log in with a session?

Thanks!


@warren @stevie @liz Added the design guidelines. Please look into them and give feedback whenever your schedule permits.

@warren: I would look deeper into Github's API for these sample use cases (as they are pretty general use cases, I highly doubt that we would skip any of these in the original implementation). Meanwhile, let's see if the required functionality is available for the Publiclab API, and work on the calls that we need to make but which do not already exist.

Regarding the authentication part, isn't the API public, or does one need an Auth token (maybe something like OAuth tokens) in order to make requests? If so, then we would need to indeed make an Auth token for the bot and allow a sufficiently large number of calls (per day, if that's our metric) so that the bot functions properly. In that case, we would store our Auth key along with the other Auth keys (we would need OAuth tokens from Github, Gitter, Wikipedia, Google and any other services we use) in the environment file.

Otherwise, if the API is totally public without restrictions, I don't see why we would require an API key. One advantage of dropping the API key model (among a sea of disadvantages) that I would like to mention is that a contributor would otherwise also need to register for API keys for all these services. Every single API key that we manage to cut would slightly decrease the time and work required for setting up the project.

Thanks


Hi there! Thanks for your work here. Probably because I don't know much about how programming works, I'm having a hard time understanding what you're asking for input on. Could you phrase your question or what you're looking for input on in a way a non-coder could better understand? If not, no worries, but I know @warren is mostly away from internet for the next couple days, so we might need to wait until he has a moment to get back on and catch up.

Thanks again!


@stevie I would need input on the design side of things, as we need the design to "just work" for everyone. I proposed a basic design concept (one that uses FABs and floating panes). I would love it if you guys could propose ideas and make suggestions regarding the design, as the current concept might not be the best we could do. Also, it would be super cool if we could somehow also involve people who are active users of the website to learn what exactly they expect from us, both feature-wise and design-wise. Thanks.

