Twilio, Google Cloud Vision and OpenAI Friend Chat
Can a computer understand a meme?
Started writing 01/05/2023 9:20 PM
Working code by 01/07/2023 9:16 PM
- this project costs money
- you are piping your data to several services, think about what you’re sending
So I’m a person that’s in front of a computer pretty much all the time if I’m not asleep or doing something else. I write code for work or for myself. I’m also the type of person that immediately responds to text messages and whatnot… ergo no life. A (messed up) thought crossed my mind… if you never really hear a person or see their face, can you replace them with a “friend chat bot”? In an effort to spam my friends less I’m going to try this out (message this bot more/friends less).
I know that the bot doesn’t actually understand you, but it’s a fun little project.
I have been using Twilio’s programmable SMS service for years now. I use Google’s Android Messages, so I have that tab open on my desktop all the time and I send text messages through it. This “friend bot” will be one of the conversations in there. I’ll rent a new Twilio number for it. It will accept memes (photos, videos), run them through an image classifier and then use OpenAI to generate the responses. I finally signed up for their service.
Do the work (started at 11:52 AM, in random order)
- set up/use a server: time completed
- rent a new Twilio number: time completed
- receive media in Twilio and forward: time completed
- get frame from video (ffmpeg): 12:18 PM
- send image/frame from video into a visual classifier: 1:56 PM
- send prompt into OpenAI Friend Chat: 12:45 PM
- message me back: time completed
- article parsing/summarizing (extra feature)
- respond to me randomly (extra feature)
Yeah, I forgot to fill most of these out.
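The media steps in the list above can be sketched roughly as below. This is a minimal sketch, not the code from the repo: `extractMedia`, `needsFrame` and `frameArgs` are hypothetical helper names, and the choice to grab the frame at the 1 second mark is an assumption. The `NumMedia`, `MediaUrl0` and `MediaContentType0` fields are the ones Twilio sends on an incoming MMS webhook.

```javascript
// Hypothetical helper: collect attachments from a Twilio incoming-message
// webhook body (already parsed from the form-encoded POST).
function extractMedia(body) {
  const count = parseInt(body.NumMedia || '0', 10);
  const media = [];
  for (let i = 0; i < count; i++) {
    media.push({
      url: body[`MediaUrl${i}`],
      contentType: body[`MediaContentType${i}`],
    });
  }
  return media;
}

// Hypothetical helper: videos need a still frame before classification,
// images can go to the classifier as they are.
function needsFrame(contentType) {
  return contentType.startsWith('video/');
}

// Hypothetical helper: ffmpeg arguments that write a single frame.
// Assumption: the frame at 1 second is representative enough for a meme.
function frameArgs(inputPath, outputPath, atSeconds = 1) {
  return ['-y', '-ss', String(atSeconds), '-i', inputPath, '-frames:v', '1', outputPath];
}
```

The argument array could then be handed to something like `child_process.execFile('ffmpeg', frameArgs('in.mp4', 'frame.jpg'))`, which avoids building a shell string by hand.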
I’ve done a lot of this work already in the past through other projects, although it still takes time to stitch it all together. This is just a code-gluing project. I’ll do this work on a weekend (fresh brain).
- Linux VPS on OVH with NGINX/Let’s Encrypt/PM2, NodeJS, Twilio NodeJS API, ffmpeg
- Google Cloud Vision API (Object Localization)
- OpenAI Friend Chat NodeJS API
This is not free to use btw… the cost will be around $6/mo + $0.026 per exchange (1 image, 2 SMS). Object localization (Google Cloud Vision) is free before hitting 1000 requests in a month. Not sure on the OpenAI cost.
I will time myself to see how long it takes me to build this. It took me too long: around 8–9 hours. Should have been under 5.
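As a rough worked example of that arithmetic, here is a minimal sketch. It assumes the $0.026 applies once per exchange (1 image in, 2 SMS), that usage stays inside the free Google Cloud Vision tier, and it leaves out the OpenAI cost because that number is unknown. `estimateMonthlyCost` is a hypothetical name.

```javascript
// Figures taken from the estimate above; everything else is an assumption.
const FIXED_MONTHLY_USD = 6;     // flat monthly cost
const PER_EXCHANGE_USD = 0.026;  // 1 image + 2 SMS

// Hypothetical helper: monthly cost in USD for a number of exchanges,
// excluding OpenAI usage and assuming the Vision free tier is not exceeded.
function estimateMonthlyCost(exchanges) {
  return FIXED_MONTHLY_USD + exchanges * PER_EXCHANGE_USD;
}

console.log(estimateMonthlyCost(100).toFixed(2));
```

With 100 exchanges in a month that comes to about $8.60 before OpenAI charges.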
OpenAI Friend Chat
I’m using their default friend chat playground example code… it looks like you follow a prompt pattern of “I say this, you say that.”
So when I send in my plain text message (SMS) or the image/video context, I would format it like the pattern above.
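A minimal sketch of that formatting step, assuming the turns are labelled `You:` and `Friend:` as in OpenAI's friend chat example. `buildPrompt` and the shape of the `history` array are hypothetical, not taken from the repo.

```javascript
// Hypothetical helper: turn earlier turns plus the new incoming message
// into an "I say this, you say that" prompt that ends on an open
// "Friend:" line for the model to complete.
function buildPrompt(history, message) {
  const lines = history.map((turn) => `${turn.speaker}: ${turn.text}`);
  lines.push(`You: ${message}`);
  lines.push('Friend:');
  return lines.join('\n');
}

const prompt = buildPrompt(
  [
    { speaker: 'You', text: 'What have you been up to?' },
    { speaker: 'Friend', text: 'Watching old movies.' },
  ],
  'Did you watch anything interesting?'
);
console.log(prompt);
```

The same function works for image or video context: whatever text describes the media is passed in as `message`.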
Development brain dump (boring cringe content below)
(raw stream of consciousness)
Devlog on GitHub: jdc-cunningham/programmable-friends, devlog folder on the with-devlogs branch
Post build writeup
So this took me much longer than I wanted, mostly due to dumb errors I kept running into and some lag from developing/testing in prod.
But it works. Google Cloud Vision doesn’t figure out some images, and in those cases the response is vague, like a person who isn’t sure what they just received.
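One way the classifier output could be turned into message text is sketched below. It assumes the input looks like the `localizedObjectAnnotations` array that Cloud Vision object localization returns (objects with a `name` and a `score`). `describeObjects`, the 0.5 score cutoff and the wording of both sentences are assumptions for illustration.

```javascript
// Hypothetical helper: build the text that stands in for an image in the
// prompt. When nothing is recognised confidently, fall back to a generic
// sentence, which is where a vague reply would come from.
function describeObjects(annotations, minScore = 0.5) {
  const names = annotations
    .filter((a) => a.score >= minScore)
    .map((a) => a.name.toLowerCase());
  const unique = [...new Set(names)];
  if (unique.length === 0) {
    return 'I sent you a picture.';
  }
  return `I sent you a picture of ${unique.join(', ')}.`;
}
```

Keeping the fallback explicit makes it easy to log which images the classifier could not label.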
I’m spent now so I’m just closing this up, will record a video to demo it.
I’ll post a follow-up in the future to see what mods I do/how this works out over time.
Oh, it looks like if the bot’s response ends with a question, it sometimes goes on to answer its own question, interesting. I’ll have to look into that.
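One possible cause, not confirmed here, is that the model keeps writing the next `You:` turn itself. A minimal sketch of a client-side guard, assuming the `You:` label marks my side of the conversation. `trimReply` is a hypothetical name.

```javascript
// Hypothetical helper: keep only the bot's own turn by cutting the
// completion at the first point where it starts speaking as "You:".
function trimReply(completion, stopMarker = 'You:') {
  const idx = completion.indexOf(stopMarker);
  const reply = idx === -1 ? completion : completion.slice(0, idx);
  return reply.trim();
}
```

The OpenAI completions API also accepts a `stop` parameter, so passing the same marker there would end generation at that point instead of trimming afterwards.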
01/11/2023 this thing works but I don’t really want to use it, because everything I send goes to a random server; it’s not something you can “trust” like a friend. I put trust in quotes because the medium of communication with your friends is not secure either, so that’s arguable… but yeah.