Programmable friends

Twilio, Google Cloud Vision and OpenAI Friend Chat

GitHub Repo

Video demo

It’s like a mockingbird: not really there, but it will distract you

Can a computer understand a meme?

Open source memes hit different

Started writing 01/05/2023 9:20 PM

Working code by 01/07/2023 9:16 PM

Disclaimers

  • this project costs money
  • you are piping your data to several services, think about what you’re sending

Background/reason

So I’m a person that’s in front of a computer pretty much all the time if I’m not asleep or doing something else. I write code for work or for myself. I’m also the type of person that immediately responds to text messages and whatnot… ergo no life. A (messed up) thought crossed my mind… if you never really hear a person or see their face, can you replace them with a “friend chat bot”? In an effort to spam my friends less, I’m going to try this out (message this bot more, friends less).

I know that this doesn’t actually understand you, but it’s a fun little project.

Platform

I have been using Twilio’s programmable SMS service for years now. I use Google’s Android messages, so I have that tab open on my desktop all the time, and I send text messages through that. This “friend bot” will be one of those entries. I’ll rent a new Twilio number for it. It will accept memes (photos, videos), run them through an image classifier and then use OpenAI to write the responses. I finally signed up for OpenAI’s service.

So instead of writing a friend, I’ll write to one of these Twilio numbers.
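To make the receiving side concrete, here is a minimal sketch of reading an incoming message. It assumes the webhook body has already been parsed into a plain object; `NumMedia`, `MediaUrl0`, `MediaContentType0`, `Body` and `From` are the fields Twilio posts for an incoming message, while `parseIncomingMessage` and `mediaKind` are hypothetical helper names, not code from the repo.

```javascript
// Hypothetical helper: collect the text and media attachments from the
// form fields Twilio posts to the webhook for an incoming message.
function parseIncomingMessage(params) {
  const count = parseInt(params.NumMedia || "0", 10);
  const media = [];
  for (let i = 0; i < count; i++) {
    media.push({
      url: params[`MediaUrl${i}`],
      contentType: params[`MediaContentType${i}`],
    });
  }
  return { from: params.From, text: params.Body || "", media };
}

// Hypothetical helper: decide which path a media item takes.
// Images go straight to the classifier, videos need a frame grabbed first.
function mediaKind(contentType) {
  const type = contentType || "";
  if (type.startsWith("image/")) return "image";
  if (type.startsWith("video/")) return "video";
  return "other";
}
```

A plain text message simply comes back with an empty `media` array, so the same function covers both cases.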

Do the work (started at 11:52 AM, random order)

  • setup/use a server: time completed
  • rent a new Twilio number: time completed
  • receive media in Twilio and forward: time completed
  • get frame from video (ffmpeg): 12:18 PM
  • send image/frame from video into a visual classifier: 1:56 PM
  • send prompt into OpenAI Friend Chat: 12:45 PM
  • message me back: time completed
  • article parsing/summarizing (extra feature)
  • respond to me randomly (extra feature)

Yeah I forgot to fill these out

I’ve done a lot of this work already in the past through other projects, although it still takes time to stitch it all together. This is just a code-gluing project. I’ll do this work on a weekend (fresh brain).
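As one example of the glue, the "get frame from video" step could look roughly like the sketch below. It assumes `ffmpeg` is on the PATH and that one frame taken a second into the clip is good enough for the classifier; `frameArgs` and `grabFrame` are hypothetical names, and the repo may do this differently.

```javascript
const { execFile } = require("node:child_process");

// Build the ffmpeg arguments for saving a single frame as an image.
// -y overwrites the output, -ss seeks before decoding, -frames:v 1 stops after one frame.
function frameArgs(videoPath, imagePath, atSeconds = 1) {
  return ["-y", "-ss", String(atSeconds), "-i", videoPath, "-frames:v", "1", imagePath];
}

// Run ffmpeg and resolve with the path of the image it wrote.
function grabFrame(videoPath, imagePath, atSeconds = 1) {
  return new Promise((resolve, reject) => {
    execFile("ffmpeg", frameArgs(videoPath, imagePath, atSeconds), (err) =>
      err ? reject(err) : resolve(imagePath)
    );
  });
}
```

Passing the arguments as an array to `execFile` avoids building a shell string out of file names that came from the outside.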

Tech stack

  • Linux VPS on OVH with NGINX/Let’s Encrypt/PM2, NodeJS, Twilio NodeJS API, ffmpeg
  • Google Cloud Vision API (Object Localization)
  • OpenAI Friend Chat NodeJS API

This is not free to use btw… the cost will be around $6/mo + $0.026 (1 image, 2 SMS). Object localization on Google Cloud Vision is free before hitting 1000 requests in a month. Not sure on the OpenAI cost.

I will time myself to see how long it takes me to build this. It took me too long, around 8–9 hours. It should have been under 5.
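A rough way to play with those numbers is sketched below. It reads the $0.026 as the cost of one exchange (1 image in, 2 SMS), which is my interpretation of the figure above, and it leaves the OpenAI cost out because that is unknown here. The constant and function names are made up for the example.

```javascript
// Figures quoted in the post; OpenAI usage is NOT included.
const BASE_MONTHLY_USD = 6;        // flat monthly cost
const PER_EXCHANGE_USD = 0.026;    // assumed: one exchange = 1 image + 2 SMS
const VISION_FREE_REQUESTS = 1000; // Vision requests per month before billing starts

// Estimate the monthly bill for a given number of image exchanges.
function estimateMonthlyCost(exchanges) {
  return {
    usd: BASE_MONTHLY_USD + PER_EXCHANGE_USD * exchanges,
    visionStillFree: exchanges <= VISION_FREE_REQUESTS,
  };
}

// 100 memes in a month: 6 + 100 * 0.026
console.log(estimateMonthlyCost(100).usd.toFixed(2));
```

At that rate the flat monthly part dominates until a few hundred memes a month.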

OpenAI Friend Chat

I’m using their default Friend Chat playground example code… it looks like you follow this prompt pattern of “I say this, you say that.”

So when I send in my plain text message (SMS) or the image/video context, I would format it like the pattern above.
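A minimal sketch of that formatting step follows. The `You:` and `Friend:` labels are taken from OpenAI's Friend Chat playground example as I remember it, so treat them as an assumption; `buildPrompt` and the shape of `history` are hypothetical.

```javascript
// Hypothetical helper: turn earlier turns plus the new message into one prompt.
// The prompt ends with "Friend:" so the model writes the friend's next line.
function buildPrompt(history, message) {
  const lines = [];
  for (const turn of history) {
    lines.push(`You: ${turn.you}`);
    lines.push(`Friend: ${turn.friend}`);
  }
  lines.push(`You: ${message}`);
  lines.push("Friend:");
  return lines.join("\n");
}

const prompt = buildPrompt(
  [{ you: "What have you been up to?", friend: "Watching old movies." }],
  "Did you watch anything interesting?"
);
console.log(prompt);
```

For an image or video, `message` would be a sentence describing what the classifier found rather than the raw SMS text.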

Development brain dump (boring cringe content below)

(raw stream of consciousness)

Post build writeup

So this took me much longer than I wanted. Mostly due to dumb errors I kept running into and some lag from developing/testing in prod.

But it works. Google Cloud Vision doesn’t figure out some images, and in those cases the response is vague, like a person who’s not sure what they just received.
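Here is one way the classifier output could be turned into a message for the bot, including the "no idea what this is" case. Object localization returns annotations with a `name` and a `score`; the wording of the sentences, the 0.5 cutoff and the `imageContext` name are assumptions for illustration.

```javascript
// Hypothetical helper: describe an image using the objects Vision found.
// Falls back to a vague sentence when nothing passes the score cutoff.
function imageContext(annotations, minScore = 0.5) {
  const names = (annotations || [])
    .filter((a) => a.score >= minScore)
    .map((a) => a.name.toLowerCase());
  const unique = [...new Set(names)]; // drop repeats such as two cats
  if (unique.length === 0) return "I just sent you a picture.";
  return `I just sent you a picture of ${unique.join(", ")}.`;
}
```

With an empty result the bot only knows that a picture arrived, which would explain the vague replies.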

I’m spent now so I’m just closing this up, will record a video to demo it.

I’ll post a follow-up in the future to see what mods I do/how this works out over time.

Oh, it looks like if the bot responds with a question, it sometimes answers its own question. Interesting. I’ll have to look into that.
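My guess, not verified against this code, is that the model keeps writing past the friend's turn and starts the next `You:` line itself. The completions API takes a `stop` parameter for exactly this; a client-side safety net could look like the sketch below, where `trimAtStop` is a hypothetical helper and `You:` is the assumed label.

```javascript
// Hypothetical helper: cut the completion at the first stop sequence,
// so only the friend's own turn is sent back over SMS.
function trimAtStop(text, stops = ["You:"]) {
  let end = text.length;
  for (const stop of stops) {
    const i = text.indexOf(stop);
    if (i !== -1 && i < end) end = i;
  }
  return text.slice(0, end).trim();
}

const raw = " Did you see it?\nYou: Yes I did\nFriend: Cool";
console.log(trimAtStop(raw));
```

Everything after the first `You:` is dropped, so the bot's question goes out without its own answer attached.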

Cat tax

Updates

01/11/2023: this thing works, but I don’t really want to use it because everything goes to a random server. It’s not something you can “trust” like a friend. I put trust in quotes because the medium of communication with your friends is not secure either, so that’s arguable… but yeah.

Jacob David C. Cunningham

Software developer and general technology tinkerer