Revolutionizing Conversations: AI Chat Integration on YaleSites

Time

Thursday, 9:30 am CDT - Thursday, 10:30 am CDT

Location

Room 314B

Description

Our condensed five-week journey is an epic tale of innovation and collaboration, where Drupal becomes the canvas for a groundbreaking AI experience. Immerse yourself in the pivotal decisions we made during the rapid development phase, and witness how our team navigated the complex landscape of data curation, ethics, and design to ensure the seamless fusion of AI into the fabric of YaleSites. In our commitment to inclusivity, discover how our team prioritized efforts to make the AI chat accessible to everyone, including users with disabilities. We'll share the strategies employed for ensuring a seamless experience, detailing the incorporation of accessibility standards and the rigorous testing conducted to validate our platform's usability across a spectrum of diverse user needs. Be inspired by our approach to data curation, ethical considerations, design principles, testing, and more.

Speakers

Randy Oest

Creative Director @ Four Kitchens

Randy Oest is a designer, thinker, and creative director. He works at Four Kitchens, is an avid Star Trek fan, and plays too many board games.

Twitter: @amazingrando
Email: [email protected]

Track

Back-End

Design

End users / Content Admin

Project Management

Feedback Form

Transcript

Thank you all so much for joining me the early this morning. Uh, if you don't have coffee, there's plenty of coffee out there to to help kick start your day. So we're here today to talk about a project that Four Kitchens and Yale worked on for doing AI chat integration. Basically, this is kind of like a post mortem to kind of talk you through, like the challenges that we face, the things that we did. Um, and I'm going to try and leave some time at the end for questions because I'm absolutely sure there are probably going to be questions. So. All right. So without further ado, first, a little bit about me. I'm the creative director at Fork Kitchens. I've worked at Fork Kitchens for the last eight years. I've talked to Todd. When I hit ten years, we're going to add a fifth kitchen. It's going to be amazing. Uh, I am also a giant tabletop role playing game nerd. Um, I play a lot of Dungeons and Dragons and games that aren't Dungeons and Dragons. In fact, I volunteer at my local library to run RPGs for their teen program.

Um, it's really rewarding. And I just ended a two year campaign centered on the premise of What If? During the Cold War, the Space race also had elder gods like Cthulhu. All right. So from 1947. Whoa, that is not right. I think the internet hates me today. Uh, give me just a moment. Oh, I just want to say doing this at this angle, it's terrible. Yeah, I've got an internet problem. Give me just a moment. So while I'm fixing that, working on fixing this, who here has been doing. Of course it's going the wrong direction. Um, who here has been working on AI in their on their projects right now? Anyone got anything live? Of course you do. Um. You've got something live. Great. Yeah. I have no internet. All right. So let me switch over to my phone. It will load all of my images, and then I can, you know, proceed. But.

>>:
Um. Let's see. I know that we were having some internet problems yesterday. Sorry, I thought I could vamp and troubleshoot. That's not. True. Yesterday I was thinking. Did you put stuff in? As they. Yeah. There was a time where I would, like, never do a live demo like screengrabs of everything. That's. Sometimes. It was better. Do with that. Stolen right back. All right. It looks like we are good now. All right. Thank you all for bearing with me for technical difficulties. Um, I guess the internet needed some coffee this morning as well. All right. So, like I said, I was I had this great segue talking about the space race. Um, and I actually wrote it into the talk here because the space race fascinates me. Okay. From 1947 until 1991, the United States and Russia were bitter rivals. These two superpowers had an ideological and geopolitical struggle for global influence, and the space race was one of the most significant ways that the US and Russia competed. So to achieve the impossible mission of getting people into space, each side had to be inventive and focused.

The United States and Russia both wanted to be the first to put a person on the moon. So John F Kennedy was lukewarm on NASA whenever he was first president. He really didn't want to spend any money on the space program. Um, in fact, he would have scuttled it, uh, if he could have. Now we're going to get back to Drupal in, I, I promise. Okay. Um, however, during John F Kennedy's presidency, the Russians were the first to put a person into space in a single 108 minute orbit. And this embarrassed the United States and lit a fire in John F Kennedy. Um, JFK immediately asked for congressional support of the space program in a speech titled Special Message on Urgent National Needs. Okay, so we're going to bring this back to our work with Yale for kitchens does a lot of work with Yale. We're generally not allowed to talk about it because they've got NDAs for that sort of thing. Um, but what's nice is that they've given us permission to talk about it today. Um, we've actually spent the last 2 to 3 years creating the latest version of what Yale calls YaleSites.

And YaleSites is a platform that allows the easy creation, deployment and maintenance of Drupal websites. There are hundreds of websites at Yale that run on YaleSites. It is amazing and it warrants its own talk. In fact, I think at Drupalcamp new Jersey last week, the team at Yale and one of our engineers gave a talk on all of the work that they've been doing there. It's super fascinating. Check it out. That's not this talk. This talk is about the space. I'm sorry, not the space program. It's about I thank you. All right. So now during one of our regular check ins with our our Yale folks, our main Yale contact told us that he needs YaleSites to be a leader in artificial intelligence and AI. So he says everyone is playing with AI. Everyone at Yale is playing with AI. Every school, every department, and every website at Yale is exploring what I can do for them. So we got the speech special message on urgent AI needs at Yale. All right. So the best part is our launch date was five weeks from that conversation.

Go. So this is for kitchen's first AI project. We've all been kind of experimenting toying with it. What can ChatGPT do for us? You know, what can we do with OpenAI? What can do do with Midjourney, things like that. Um, so this was for kitchen's first AI project. The team at Yale had a proof of concept. There was a chat interface that was running on Azure, OpenAI studio, and duct tape, um, to call it an alpha is being generous. It was a lot of hard work. I don't mean to demean what they've been doing, but it was very, very fragile. Um, we had a very short timeline, but fortunately we actually had a lot of budget, and we had a lot of smart people, both from Four Kitchens and Yale, working on this project. So the planning phase for the project was easy because there were only two questions. What people do we need and what are the launch requirements when you don't have a lot of time? You can't write a lot of JIRA tickets, right? So what people I'm sorry, wrong bullet. Anyway, so we determined that our team was going to be a back end engineer for four kitchens and three engineers from Yale to focus on the Azure AI stuff.

Okay, so pulling, scraping the data, integrating the data, doing all of the like, answering all of the unknown unknowns that we need to have answered, we had another back end engineer just for Drupal tasks, um, making sure that Drupal was able to integrate with what we were doing. Uh, we had a front end JavaScript engineer and a designer working on the chat interface. We had a content strategist for Prompt Engineering for planning the promotional landing page and communications materials, and we had a creative director to give a talk at mid camp. So, uh, once we, uh, we defined a successful launch. As a web page to show off the. I search at a custom domain, a conversational tone of the chat interface that should have a personality. We need to use data from the Yale Hospitality Drupal website, and we need a way to drop this into any Drupal site using YaleSites. So with that, we then went forward. Um, I want to talk about the data. Nope. I want to talk about the data. So when the project started, the information that we were getting from Yale Hospitality.

Um, we were scraping the Yale hospitality site and dumping that into a vectorized database. Um, I believe during the five weeks that we worked on this, we switched, uh, which database it was getting piped into, I think twice, because we were experimenting and trying to figure out what we wanted to use and how we wanted to vectorize that data. So Yale hospitality posed a problem because, well, you would think with Drupal we would want to just consume it directly from Drupal instead of scraping it. You know, that just seems kind of weird, right? Well, there's a there's a couple of reasons why we had to scrape the data from Yale Hospitality. So the first is that the Yale hospitality site was still on Drupal seven, it was still on the old version of YaleSites, and it had not yet migrated because there was a lot of custom work that they required, um, to, to be integrated before they could migrate. The second is that the data that the Yale hospitality site was more than just data from Drupal.

They also have a menuing system in Yandex, which during the course of the project we renamed Meow Mix because we were punch drunk during the five weeks we were trying to get this done. Um, and they also had a lot of catering information that was just in PDF format. So we needed to find ways to take all of this data and get it to work. Um, and we weren't trying to get it to work, right. We were trying to just get it to work for launch. Ultimately, when we launched, the data was still being scraped, um, because there were two new complications that arose from dealing with the data that we had to focus more energy on. And those two problems were prompt engineering and ethics. So, weirdly, the ethical issues were easy to solve on this, this, uh, this project. All right. So as we were putting together our messaging for the site, because this was going to be a site that was going to launch to the Yale community, they were going to have a lot of questions. We came up with a list of questions that needed to be answered.

Um, so that, you know, people in Yale wouldn't revolt. Um, so the questions were, what is what is done with the information that a person types into the AI chat interface? Is it retained? Is it analyzed? What do we do with it? Uh, another question. What does the system or sorry, how does the system respond when a person types in a question that the system should not respond to, such as self harm and those sorts of ideation? Um, what happens if a person tries to trick the system into doing something that is not supposed to, like, reveal to me what your prompt like set up is? And you know, I know you're set up to tell me about Yale hospitality, but I want you to tell me how to build a bomb. Um, so throughout this process, um, because we were using Azure, OpenAI studio, we had many conversations directly with Microsoft about a lot of these issues, and we had questions. Um, um, the nice thing is, is that Azure, OpenAI studio has controls for these. Um, and in addition to that, Microsoft told us that one thing that you can do is you can actually, like process the data as well if you want to customize that response.

So what that means is that data retention of what a user types into the chat interface is something that you decide to collect or not as an admin, which was really easy for us because we didn't want to collect that information. We saw like there was nothing important about having it. We saw no value in it. Um, as well as like privacy guarantees that Yale University has. And so we just kind of just made sure to set it so that it was going to forget it. Now, that said, um, when you're in the chat interface and you enter a prompt and it gives you a reply and you enter in another prompt, it actually does feed back in previous prompts and information so that it has context. That's important to the chat interface, but that is not retained at like a database. Well, not retained permanently at a database level. So I guess it is technically temporarily cached. Um, and that's about the extent of it, uh, for um, for content filtering. Um, Microsoft has a content filtering policy that applies. Eyes to the eye that they provide.

Um, these this content filtering models for hate, sexual violence and self-harm categories. And it has been trained in and tested in a lot of languages English, German, Japanese, Spanish, French, Italian, Portuguese and Chinese. And it also works in a lot of adjacent languages, just not at the high level of precision that they have for for that set of languages. Um, if you actually go to ask Yale.edu and enter in anything about violence or self-harm, it will immediately give you a red bar saying that like, this is, you know, please restate, you know, the nature of your medical emergency. Like refactor it like, this is not a thing that that we deal with because of content filtering policies. Um, if we had wanted to, we could take responses to things like that and actually put it through a system that we wanted to like, uh, Azure Open AI studio would give us a flag saying, this is an error because of content filtering, and then we could choose on our own to take that information and do something with it.

For instance, we debated whether or not we wanted to like redirect users to, you know, to help services at Yale. Um, ultimately, we had five weeks. And so that was something that was not implemented. We just like, stop it cold and just go from there. Uh, let's see. Um, defending against hackers, trying to trick the interface. So trying to get it to tell you how to make pipe bombs, for instance. Um, this ties into another issue, the other issue that I mentioned, which is prompt engineering. So prompt engineering is the process of structuring text that can be interpreted and understood by the generative AI model. So basically it's a natural language text describing the AI's task. Tell me about pizza in New Haven, Connecticut. Great. That's that's exactly what it's supposed to be. Um, however we pardon me, we encountered several challenges. Um, you know, so we the aforementioned hacking, the API. Um, we determined or this was our first big AI project. So we determined that or learned that prompts are not deterministic.

And so you can get wildly different results from the same prompt. Um, there was the poorly structured content was causing the system to return wrong results to prompts. Um, and getting the voice and tone to have a personality was a was a fun challenge. We actually enjoyed that a lot. So to break this down, uh, to combat against problems with what your AI generates, you have to do what is referred to as red teaming. Uh, red teaming is a structured testing effort to find flaws and vulnerabilities in the AI system. So the term red teaming itself was popularized during the Cold War and began to to be formally integrated into war planning efforts by the US Department, uh Defense Department and simulation exercises. They were referred to as red teams because they were the Russian teams trying to cause problems. Listen, I like the Cold War in the space race. We're going to have a lot of those references. Good morning. So, uh, so red teaming generative AI is honestly an entire talk into itself.

Um, the the community is trying to figure out what exactly red teaming is and how to execute it, and how to know that you've executed it. Well, um, there's no solid definitions for like, what it actually is, but the short version is that you want to try and exploit the AI, you want to try and create problems. What you're going to do is you're going to, um, uh, you're going to take, uh, like, sorry. So, um, generally, I like to think that that what we do, we're good people. We're trying to make sure that we make the world a better place. Um, it was actually really fun in this process to kind of put on our black hats and pretend to be bad guys, uh, trying to get the AI to do things that wasn't supposed to. Sometimes we were successful, which was great, because then we were able to modify our prompts and modify their response. Um, so that was that was a really fun kind of challenge. Um, let's see. Um, so one of the other things that I mentioned was getting consistency, uh, from responses, making sure that we kind of like it wasn't wildly crazy, but it also wasn't like giving us the same sentence over and over again.

So consistency and randomness, randomness and responses with generative AI come from a variable called temperature. Temperature has a value between 0 and 1. The higher values, like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. This is where you can have like, you know, a very creative AI response or a very like straight and narrow kind of AI response. Um, and so, um, one aside for temperature, I know that that yesterday during the AI talk, um, there were a lot of like plug ins for adding in, um, was it. Uh ChatGPT. You can you do not have controls for temperature at that level. You have to actually use an ChatGPT or OpenAI. Rather, you have to actually use the API to be able to adjust the temperature. And that randomness you can in your prompts, try and push for it, but you can't actually change it without using the API. So this was one of the benefits of us engaging Microsoft and using, uh, Azure. OpenAI studio is that we had like all of these dials under the hood that we could manipulate.

So, um, the last problem with prompt engineering that we had is that, you know, the content that we were putting into the database to train the model was, was bad content. The concept of garbage in, garbage out is true. AI does not magically make like, you know, your content soup into content sandwiches. It does not make structured content, it only knows what it is getting input. And so if content is challenging for a person to read, it is also difficult for AI. Um, we discovered this because one of the big things with Yale hospitality that we were querying was about food. Like, hey, I'm a vegan and I'm looking for some food. Can you recommend, like, you know, what's near my my classroom and what foods I can get? And generally it was good, except, you know, it would quite often list chicken nuggets. So if you're a vegan, you probably don't like chicken nuggets or you probably don't want them like, I mean, maybe you like them, I don't know, some. Anyway, um, so with content, we discovered that if you were able to structure it better, like using headings and lists, things that will make the content more readable for a person will also make it easier for AI.

In fact, our content strategy practice has started to adopt AI as a user of content. And so we've been talking to our clients about how, like, if you're thinking about AI in the future and you want to use it, you need good content to do that and so you can get better answers. Um, also, using AI forces you to remove old and outdated content. Um, you may have hundreds of event nodes in your site that really aren't important to you because they're just they're kind of sitting there. They're not doing any harm. Well, when you start ingesting data into a model to train that is going in and it's going to be part of the answer set. And so you need. To either exclude it when you're doing your data import or data scraping in this case, or you need to have a good content policy about like, it's old, we don't need it anymore. We're going to thank it and remove it from the database. So now what about the the interface? This is my favorite part of the creative director. This is all the stuff that I get excited about.

Um, uh, but we needed to design and build an interface. So in five weeks, building something from scratch is right out. Nobody has time for that. We need to to find it. Get a leg up. Fortunately, Microsoft has the Azure Chat Solution accelerator powered by Azure OpenAI service. They're really good at naming things, people really good at naming things. So but basically this is a react chatbot that has all of the basic functionality put in. It responds to error messages you can like. It will animate the the the chat as you're entering information in it is a significant leg up. And if you use their fluent design system, um, it is it's really nice to be able to just, you know, adapt it to your adaptions to Fluent Design system. If you don't use the Fluent Design system, like we don't, we have a custom design system for Yale. That was a little bit of a challenge, but we got over it. It was fine. What was what was not fine is remember earlier how I said that we had a front end JavaScript engineer on the project?

Well, that only sort of, um, so we had a great front end engineer with some react experience. We didn't have anyone who was available who fit the bill. So we had this, this front end engineer who was really good and had a little bit of react experience. Um, and then we had myself, I build many react projects and apps for tabletop RPGs on the weekends. And so it was the two of us basically working on the front end code, uh, making sure that it was responding and doing the things that that it needed to the, the front end engineer was taking care of the implementing the styling from our designer. Uh, and I was engaging with the designer, talking to them about what the limits of this, uh, this Azure chat solution accelerator powered by the Azure OpenAI service. I talked about what limitations we had in the design process, and I was also working on the complicated parts of the react app. Um, now, the first question that we had to to answer about the chat interface is, how are we going to put this react app into Drupal?

We got that answer on week four, which was great. Um, the answer to that question was we're just going to insert it using a custom block. Um, and that worked out really well. Um, but I have to say, those first four weeks as we're building it, it's like we can do react, right? We can do react, right? So it's a little bit nerve wracking, at least for me. Now for the interface, we debated two directions, um, for the way that the visuals were going to work for this. One is doing basically like a chatbot where there's like a little round button in the lower right hand corner of the page where like, you know, you click on it and the chat bubble opens up. This is a common pattern that we see a lot of, like Salesforce and customer service. You know, accounts use this kind of model to interact with people. Um, but I didn't want to go that direction because I did not want this, like this service that we were building to be like, have people connect their mental model to like chat bots, which a lot of people consider useless and junk.

I wanted people to actually engage with this and want to enjoy it. And also the reason that we were building it is so that people could ask questions about the website. Basically, it was about doing like AI search as opposed to like trying to convince them to, or helping them with customer support or helping them through a sales funnel. So where we ended up was a full screen takeover. Um, and I'll, uh, I'll show at the end. I'm going to show you the actual like, ask website so you can take a look at it, but basically like it's a full screen takeover. And this allowed us to, um, to custom brand it. Uh, one of the big questions that we had throughout this process is, does does this like chat interface, should it adopt all of the decisions that the individual YaleSites do? Because they have each Yale site can choose like color patterns. They can choose like different priorities, like if they want to have like be more bold or be more like conservative. They can like pick these choices that affect the what their website looks like.

And um, I, I lobbied and was successful in saying like, no, I think that this chat interface needs to be its own thing because it is new and I think that we need to brand it and we need people to have a consistent experience with it, so that over the next couple of years, people will know exactly what it is. And then after a couple of years, we can start to have it recede into the background a little bit, then start integrating it a little bit more. Or because it is this new thing. Um. Let's see. Uh. All right. So. All right, um, the other issue, the other reason why I wanted to have, like, a full screen takeover and not have it be like a little button in the corner. Um, is that I kind of websites today have a problem that I don't care for, which is I call it the multiple overlay problem, where you have a chatbot, a cookie banner, an email sign up interstitial, like all of these things kind of like popping up, trying to get your attention. And, you know, whatever is on top is just whoever put the longest number in the z-index for stacking.

So I wanted this to be a good experience. And it was. We launched it. Asked Yale.edu. It is a live website. You can go to it. It's fantastic. We we were ready and we launched Ask Yale on Time with the list of priorities that we talked about earlier. We landed our person on the moon go us. So now when I say we were ready, what I mean is, is that we had a great product to show off, but we also had several weeks of refinement that had to go on under the hood. So, um, I'm not sure how aware you are, but like when clothing companies do photo shoots, all of the clothes that fit the models perfectly are in part because the models are perfect. That's why they're hired, but also because what they're doing behind the scenes is they're pulling the clothes tight, and they're using clamps behind there so that everything fits perfectly because they're clamping it and making sure that it looks great for the photography, not so that it actually fits on the person. Um, and that was kind of what we launched with.

We had some clips that were holding some things together. We wanted to make sure that we were going to be able to do the things that we promised that we were going to do. So we went through this entire process and like the the space program, the, uh, the benefits were numerous and continue to impact our lives today. So, um, some examples from the space race, uh, that benefit us today. Like artificial limbs, scratch scratch resistant lenses, firefighting equipment, dust busters, Lasik shock absorbers for buildings, solar cells, water filtration and visible braces, um, freeze dried foods like astronaut ice cream. Um, and just like the space race this project gave us and Yale many benefits. That intense five weeks that we worked, uh, gave us a lot of a lot of things to work from first. And this was the surprise is how much content strategy needs to go in to a good AI interface. All right. So content optimization for AI leads to improved human readability of the content, improved search engine optimization and improved on on site search results.

And it reinforces Web accessibility best practices because you want to make it very clear what information is. So if you have if you have a a image of a table, um, that doesn't work in your eye, so you fix it for your eye, you're also fixing it for people who you know, are going to then be able to access that data that was previously locked behind an image. Um, in addition to this, the Yale team has created a Drupal module to empower Drupal websites with AI capabilities. Um, this facilitates the transformation transformation of Drupal's data model into, uh, large language models Llms. And it facilitates the creation of embeddings from Drupal content and metadata, enabling efficient content management and transformation into a vector database. Basically all of the like clips that we had when we launched. We now have a Drupal module that will actually consume data directly from Drupal. So we're not doing any scraping anymore. We're pulling in data directly from Drupal and and pushing that into a vector database.

Um, this particular module, which is available at GitHub under the YaleSites. Org. Um, uh, it is focused on Azure Open AI studio, but it is very adaptable and can be used for other AI services. And I think that's part of the reason why it hasn't made its way to Drupal.org yet, is that they want to make it a little bit more robust before pushing it there, so that other users can use it. Uh, let's see some other benefits of this project. Is this defined our prompt engineering practice? Um, so in order to get to get prompts to work and do the things that it needs to, we've started to realize that the challenges with doing, like an AI chatbot or an AI search is less with the. Technical because most of that is wiring up service A to data B, it's the prompt engineering to get the data that we need and the responses that we need and the way that we need it. That is the most challenging. Um, and let's see, uh, also the the excitement around having an AI interface, this AI search has accelerated adoption of the new version of YaleSites at Yale, which itself raises the bar on quality of content and accessibility at Yale.

So it's becoming this virtuous loop of everyone saying, hey, can I get AI? And we're like, yes, you can get on YaleSites. Um, and whenever they do that, they get a lot of benefits because we've got a lot of structure. It's just becoming a virtuous cycle. So with that, I want to show you all, uh, ask Yale, and I'm going to need to rearrange. Oh, actually, I can't rearrange my screen because of the recording. Let's see. Ask. Fast. There we go. All right, so. Um, we have a block on the page that, um, uh, inserts this header. Um, and this header is part of the react app, and it loads. Whenever we click on this button, it opens up the SQL interface. Um, we have pardon me. Um, we have a couple of questions to kind of help people like, kind of like push them along. Um, you know, like what is served for breakfast, lunch and dinner. And it gives us our responses. And one of the things that we programmed in are citations, uh, because we want to give people confidence to be able to check the work of the AI, because the AI is, um, it's an unreliable narrator, to use some writing terms, but basically, um, and the citation, you can actually go and go straight to the site and see what the information is.

Um, we also let's see tell me about Handsome Dan. For those who aren't in the know, Handsome Dan is Yale's mascot. Uh, it's their little dog mascot. Um, if you Google searches for Handsome Dan, you will find one of the most adorable dogs, uh, you will ever see. Um, and, uh, you can see here it gets super excited about Handsome Dan. And in fact, in our prompt engineering, we we have a line that says, like, liberally. I'm not sure if it's liberally use emojis, but, you know, basically we encourage it to actually use emojis. In fact, if we ask about pizza, tell me about the Great Pizza and New Haven. Um, we have specifically, in addition to answering questions from Yale hospitality, the personality that we added is that it gets very excited about Handsome Dan, and it gets very excited about New Haven Pizza. If you have not had New Haven Pizza, I suggest you take a vacation to New Haven. Uh, and enjoy some of the pizza. All right. So here we've just got like some response. Looks like it's in the middle of responding.

And the internet is doing great. So but uh, doing these things live is always amazing. So at this point, this is everything that I've got listed in my talk. I would love to answer your questions. Um, and talk about this a little bit more. Does anyone have questions? Your hand went up first. I'm curious about the selection of vector databases and. Why are you like ended up cycling through a couple of different options, which ones you chose and what considerations you had in rejecting the ones. I decided not to go. That is a great question. The question was, you know, tell us more about the vector databases and why you made the decisions that you made and which ones you cycled through. Um, I have to tell you that I don't have that information. Um, that was our, like, our AI engineer and Yale's team cycling through that, so I don't know. Um, I think pinecone was one of the databases that was used for a while. Um, I don't have the specifics on that. I can tell you that. Um, was it we. I'm trying to think I'm pushing very hard for us to have more content around this particular, like, launch.

Um, so I can if you come up after the talk, I'll get your information, and I will share that. Uh, I think mid camp has a slack, right? Yes. Okay. I'll get that information and post it. All right. Questions next. Questions? Yes. Um.

>>:
Two questions. One. Why are you wearing a spacesuit when you do this talk?

>>:
Okay. Um, two uh, I just I'm trying. To wrap my head around, like, the level of effort required to do something like this. So, like, in terms of, like, this ballpark, like, hours. Sure. All right. So the first question was why don't I wear a space suit while giving this talk? Um, the answer is that they are too expensive. Um, and I focus my cosplay money in other directions. Uh, but the real question was, what are the costs around this? Which is a great question. Um, five weeks, three people at 32 hours a week. Um, you know, that math is going to I can't do it off the top of my head, but that's about the number of hours that we spent that will get you a ballpark on this. Now, this seems like. Okay, compare and contrast this against a lot of what OpenAI is doing, which is like you, you get your API key and you're able to connect it up to things. And you don't have all of these, these, you know, gratuitous expenses. Because there is also on the back end, we're paying for access to the chat model.

So we're not paying $20 a month to fuel this thing. We're paying significantly more than $20 a month to fuel this thing. Um, and the reason being is that we need to access the API and be able to make a variety of tweaks and changes. I mentioned earlier in the talk about temperature and being able to control that temperature. Um, we also have custom, uh, prompt information that gets submitted every time there's a prompt that gets entered in here, there's a suite of information that goes along with it to give it additional context. And so there's a lot of customization that makes this a more expensive endeavor than just using an OpenAI API to connect. Um, obviously, if you can just use an API key to get these results, go for it. Tell me how you did it, because I would love to save that money too. But that's that's part of the cost that we have with this. So you had you had met up, uh, human hours on the on the Yale. Side too, right? Yes, yes we did. Um, which I would probably say that that would probably be two people times 32 hours a week.

Okay. Um, I don't know what their work week is. We we base our work effort around 32 hours a week of billable hours. So. Yes, the content optimization. Where was that done? In the Drupal sites. And then our recommendation was to do that in the Drupal site to do that with the original data. So if you have, for instance, um, we actually pulled an example page and did a compare and contrast, and there was a page that was five paragraphs of text that had all the information that a person needed, but you couldn't skim it. It was difficult to read. And what our content strategist did was then took all of that same information and used headings and bulleted lists to make it very scannable and legible and immediately improved, you know, human readability and the results that we got from that page. And I so like the place to make the change is in the data itself, not as part of a transformation process as the, as the data pipes through, because you want that benefit to be for everyone. And how long did that take?

Well, I mean, that's an ongoing thing. Like, you know, we made the recommendation. We told them they would get better results if they made changes, but then they have to prioritize making those changes. And so like updating and managing content is weirdly one of the things that a lot of website owners prioritize very low. Um, and so like, they're just kind of going through and making changes. Uh, I'm hoping that whenever they go from Drupal seven to the latest version, that we have an YaleSites, which I think is ten. Um, that as part of that content migration process, we'll be able to talk them into like, okay, let's get rid of everything and then make you argue to keep some things, but let's make them better. When we bring them over. So yes. Question. During that refining phase, did you end up making any user interface changes, and if so, why? Uh, the user interface changes. Honestly, the, um, their react app that Microsoft provided, I'm not going to say the name of it again. It's ridiculous.

Um, it provided us with a lot out of the gate and a lot of sensible things, and we did not deviate very far from that. Uh, mostly what we did was we we tweaked what would flag an error in the system, and we, um, designed what that presentation would be. Um, so, for instance, if I type in, um, I think if I just type in the word violence, uh, against, I'm just going to type in violence, I think. No. All right. No, um, and I'm not going to type in something that will actually cause it to red flag. Um, because no one needs to see that in the recording, but it will actually throw up a big red error. And so we we customized it based on those responses. We didn't do many changes other than like aesthetic ones and adding some additional text to the interface. So yeah. Yes. This operates outside of uh. I believe that it does operate outside of Drupal's caching system. Like I said, this is a react app that's being basically inserted through a block. Um, so. But I'm also not super familiar with Drupal's caching system, so I can't say like maybe our engineers found a way to have it cache.

Um, but I yeah, I'm sorry, I can't answer that question more thoroughly. Any other questions? Yes.

>>:
I'm curious. So this is a publicly available page, right. Like anyone. Yes. Yes you can you can go to it and learn more about New Haven Pizza anytime. So I'm curious then are there any safeguards or rate limiting that you put on users to prevent someone from just running up the bill by asking it a bunch of questions or. That's a great question. And one of the things that we considered we talked with Microsoft about that we're like, what are best practices around that? Like, you know, we don't want to like invite people to cost us, you know, like a half $1 million a month. But at the same time, we also wanted to, um, display the awesomeness that we had done and be able to talk about it publicly and allow the Yale community to engage with it in a way that was as frictionless as possible. Ultimately, we ended up on, like, I believe in the back end, we have like caps that like if someone is doing something bad and like just piping in a lot of like prompts, that it will shut it down. Um, but generally we, we decided to keep it open instead of forcing someone to log in with their, their Yale account.

Um, the other concern with that is that if we said, hey, you need to, you know, uh, log in with your Yale. So in order to engage with this, um, it was raising questions of, is this data being associated with me if I ask questions. Right. So despite the fact that, you know, we say on the page, like we're not tracking your data, like if you're logged in, that possibility exists for people. Thank you. You're welcome. Other questions? Yes. Do you. It says here like if you need help with a different topic or does it, like, ever.

>>:
Pass you off to anything else if it. Can't answer? No, we don't have it set up to to hand you off to, to anyone else. So, um, that's I mean, that may eventually be something that we integrate in because so the nice thing about doing this and integrating it with YaleSites, which, which is the platform for most of the university sites, is that the more that we're putting into, like these vectorized databases, um, the more we can start to, to use that data in a more aggregate way. And this is something that that, you know, we haven't started taking any action on, but we have started thinking about which, you know, we've got Yale hospitality. We have, um, a couple of other Yale entities that we're building things out for, the Drupal module that we have for, you know, exporting from Drupal into vectorized databases is so that we can start spinning these up faster. But once we have like dozens or even like 100 of these, like, how do we interconnect them? Like, do we um, we have asked Yale hospitality, we have asked, you know, Yale School of Art, what happens when there's a question that's relevant somewhere else, like, you know, if they're asking about admissions, should it hand off to admissions?

And that's that's a question that we haven't gotten to yet. We're very excited by that. But first, we have to build up all of these vectorized databases and then start asking ourselves, do we create aggregate databases and how do we how do we manage that data? I'm very excited for that future. Yes. And we kind of continue on that same topic that. He asked Yale is. The does the goal then to be like. Anything related to any academics or. Campus life or like, what's the ultimate scope. Of. So, um, with I being as relatively new as it is right now, we're kind of putting feelers out to find out what direction we want things to go. Um, the, the the initial plan and the plan right now is for, like, ask Yale to be customized per site. So the School of Arts asked Yale will answer all sorts of art related questions. Um, but it will not tell you about New Haven Pizza. So it's right now it's targeted to each site so that when users come there, it can be a better version of search and it can help users, um, interact with that particular website in a better fashion.

Um, the idea of having it be like school wide is a a larger kind of question. Um, you know, for like, it's a next step whenever it comes to that. Like, I can imagine a future where, like we have all of these different, like, little vectorized databases, but if someone says, hey, you know, when do I apply? The chatbot will say, hey, I just want to let you know, I'm going to go into admissions mode, and we're going to answer admissions questions. Um, using the admissions database and training model. And then if they ask a question about New Haven pizza, then it can go like, hey, I'm going to switch back to the Yale hospitality model to answer your question or like to prompt them to kind of change where they're getting the answers from. That's that is totally made up by me and the fly right now. But that's what I think the future of this is going to end up being, is that it's going to start changing context for the user based on, like, what their queries are. They can say like, hey, I can answer that, but it'd be better if I was answering it as admissions.

Do you see. I don't know if you did an. Any user testing user feedback. But like, I'm curious if users would get confused by that concept that like here I am asking Yale. And it's like only for this site. Not that that's not great, but it's like if they they assume it's for more than them. And that is ongoing. Um, right now the asked Yale.edu site uses Yale hospitality data, but if you actually look at the site, it doesn't it doesn't give you the context. This site is really about promoting what the tool is and how to use it. Um, and in fact, the ACL chatbot is not on the Yale hospitality site right now. Um, don't ask me why it's not my job. Um, but this this exists, and we're rolling it out to other Yale sites. And so we're trying to figure out how to integrate it, and we'll be doing user testing with users as it gets integrated into those sites. Because, yeah, because this is confusing. You come here and the website has nothing to do with hospitality. Like you would think if you went to ask Yale.edu you could ask questions about Yale.

Um, so really this is a promoting the tool page, not a like actually engage with Yale, although someday I hope it will turn into like asking all of Yale. Yes question.

>>:
The individual asks for the departments and whatnot. I'm sitting here imagining the AI for the library. Let's say for the library website. Where it's trained on all of the like the library catalog and all of the library databases and everything. And it would just be amazing. Oh, straight up, straight up. Um, I'm fortunate enough to be, um, uh, I'm working with a nonprofit client right now on an assistant to aid with, like, um, helping people volunteer, donate services, donate money, donate time. And one of the things that we're doing is an assistant to help with that, basically a chatbot to help them. And one of the things that we're going to be doing is like, you know, with the chatbot, it's like, I want to donate time to an organization. Who should I donate it to? And then it's going to ask them a series of questions to put together a profile, to start making recommendations for them. That's customized. And so in the chat, what we want to do is like surface a lot of these things to make it easier for people to connect with what they want to connect with.

Like some people like going to like listing pages and doing a faceted search because it puts you in control. But other people are like, just let me type into a box that I want to, like give 50 bucks to somebody. Who do I give 50 bucks to? Um, the answer is if you're if you have money to donate and you're never sure, um, donate socks to a homeless shelter, they're one of the the least donated items to homeless shelters. And they are incredibly important.

>>:
Uh, the chat bot does tell me that the library does have Consumer Reports magazine. When I ask if it does, there we go. I wonder if I'm going to ask. It if it has books in them. Awesome. Let's red team it now. All right. Great. More questions? Yes. Did you find you had to augment any of your data that you scraped with, like, more publicly available data sets? Uh, no. We do, we did not. Um, now, I mean, so the LM itself is, you know, kind of trained on this giant corpus of knowledge to make the LM work, like, you know, using, like, chat GPT 4 or 3.5. Um, and then on top of that, we just put the, the data that we scraped from Yale hospitality. Or in the case now, I think it's actually being imported from Yale Hospitality. Um, and we just put that data in we don't like we haven't added any additional context other than, um, in our like, prompt settings in Azure, OpenAI studio. Um, that's kind of like telling it, use emojis. Get real, you know, hot for handsome Dan, um, you talk about New Haven pizza and like some things that that actually help give it good answers, um, in addition to those fun things.

But that is, that is something else that like, that's one of those things that whenever it comes to like data for a chat bot, um, that I've talked to a lot of people about is that there is between the user interface and the data that you have, like there is a giant stretch that could be here where you're adding like some salt and pepper to the meal to like make it better. And so pulling in some publicly available data might be great too. Yeah.

>>:
Yes. With your experience of prompt engineering, will it be easier to have all of these small vector databases with a containerized and having the chat bots or the AI pointing to each other? Or would it be easier to have a larger database and have one AI? Yeah, so so having multiple vectorized databases per site versus having like one large database, um, I think what's so the answer is uncertain. You know, like you shake the eight ball, you get the like future uncertain check again later. The reality is, is that it's probably going to be a case of like having that data set up and vectorized databases per purpose. So like having a vectorized database per website. But then maybe there are websites that are related and those are put into an aggregated database, you know, or maybe there is information from one from all the sites that gets put into the vectorized database. So for instance, um, thinking about events, uh, so at Yale, they've got a whole system for events. I'm setting that aside.

I'm just imagining, like if we had hundreds of sites that were all posting their own Drupal notes for events like how would you how would you handle that with an AI search or an AI tool? Well, you could pull in just the events data and start to like, turn that into a database, a vectorized database, and interact with that so you can slice it by. I like, you know, by website and you can also pull it in by node type. You could also because we're using YaleSites. Um, we can add meta structured metadata to like how we're categorizing content and use that at that meta layer to then like pull in aggregate data from a variety of sources. Um, this is this is stuff that gets me super excited about AI is being able to use all of this data in ways that make it easier for people. Like the reason that I'm a creative director is I want to make the world, like, easy for people to use. You know, I want the digital world to be as easy as as the real world. Like gravity. JS never has a bug, you know you're not going to fall off the planet.

So, uh, well, I mean, maybe, you know, it depends on how high up you are anyway, so I hope that answered your question. Yes. All right. Other questions. Yes. What is, um. What is the like lift for adding this to? At this point. The additional lift so far has just been us refining the process, creating that Drupal module that allows us to consume Drupal data and export it to a vectorized database. Having the vectorized database set up in a way that that we know makes sense. Um, having the the history and knowledge to know what temperature to set prompt responses to. Um, once you get kind of like over that big Hill implementation, it's pretty easy. Um, it's just a case of wiring like Drupal data A to vectorized database B, um, and at that point, the most complicated part becomes the prompt engineering. Uh, because maybe you want to have a bombastic emoji personality, you know, responding to things. But if you're the Yale School of Law, you're not going to want emoji in your responses. And so you have to do prompt engineering for that.

And you also have to do red teaming to make sure that, like the data that you have isn't going to return anything bad. Um, the, the red teaming aspect, trying to find problems, um, is particular to the data sets. Um, in fact, we're running exercises with Yale teams who are who are getting asked Yale on their site. Um, we are putting it in their hands because they are subject matter experts to help us red team, these things we actually have like a grid of like priorities. And we're like, hey, try and break it this way and try and break it that way. See if their responses are good. And as subject matter experts, you're the ones who have to answer that. Because, you know, we're not law students. We're not school of Art students. We're not, you know, Yale hospitality. Yes. Is there is.

>>:
There a middleware. That takes. Some particular data source and pipes to. I don't know the answer to that question. Yes, I'm terribly sorry. You got the creative director and not the engineer for mid-cap. No. That's fine. Could you speak to how the data gets from one? Honestly, magic is the way that that that I see it. So I'm the Drupal module. Like I said, um, you know, kind of covers that data import, the one that that YaleSites is working on. I can't answer more specific questions than that. I'm terribly sorry. I, I wish I could I'm nerdy, but I'm not that kind of nerd. All right. How are we doing on time? We're at. Time. We're at time. All right, I'll answer one more question. Okay. Uh. If somebody would like to request is there another tabletop RPG, they can play them? I'm sorry. Repeat. The question for me is. Something like Heroquest is there a. Game that you would point them towards? So the board game Heroquest. All right, so great tabletop question. I love ending on this. Um, so if somebody likes the tabletop tabletop game board game Heroquest, what would I point them to?

Um, honestly, one of the the best ways to get people to engage with the tabletop RPG, moving them from board games into tabletop RPGs is to find a game that has the right vibe you like. You know, a lot of people will say like, oh, Dungeons and Dragons is the obvious answer, and maybe it is, but there's also, um, there's Dungeon World, which is a little bit older, and I know a little bit can be problematic because of one of the creators. Um, but there are a lot of other, like, other things in that zone, like, for instance, a game that I like that is, um, uh, sort of like Dungeon World. It's called the Watch. Um, but it's, it's a very like woke tabletop RPG. Um, and so like the question is, is you want to sync up the vibe, um, and make people excited about, like, the theme of it as opposed to maybe the particular, like, technical levels. Um, one of the games that the game that I run for the teens at the library is called blades in the dark, where everyone is scoundrels, like in like an 1800 style city where the sun never shines.

Um, and I find that, like, um, all of the teens who show up with Dungeons and Dragons characters are surprised. But by the end of the session, they've had a great time because there's a good vibe. And the rules are are not super intense. Very cool. Great. All right, now everyone has homework. Go watch For All Mankind on Apple TV. So thank you all so much.

MidCamp 2024

Revolutionizing Conversations: AI Chat Integration on YaleSites

Description

Randy Oest

DePaul University - Lincoln Park Student Center

Thank You to our Core Sponsors