Smile at the camera, and you’ll hear a random quote like this:
When you really don’t feel like smiling, it will try to comfort you:
Photo credit: Torrey Wiley
Voice source: Amazon Polly, Justin.
Introduction
Given image object recognition and text-to-speech technology, how might I create an interesting mobile experience around these capabilities?
Within one week, I brainstormed ways to combine their strengths into something with broad appeal, and built a functional prototype in HTML/CSS/JavaScript using the Microsoft Emotion API and Amazon Polly.
Ideation
Thoughts
It would feel unnatural if the robot just kept saying “Your smile is beautiful.”
- Solution: reference famous sayings from history, picked at random (sketched after this list):
What if the user just cannot smile at that moment?
- Instead of punishing someone who doesn’t smile, I think it is better to show some empathy.
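A minimal sketch of how those two branches could be wired together, assuming the recognition step returns a happiness score between 0 and 1; the quote pools and the 0.5 threshold here are illustrative, not the prototype’s exact values:

```javascript
// Illustrative quote pools -- the real prototype's lists may differ.
const smileQuotes = [
  "Peace begins with a smile. - Mother Teresa",
  "A smile is the universal welcome. - Max Eastman"
];
const comfortLines = [
  "It's okay not to smile. Take your time.",
  "Rough day? I'm here whenever you're ready."
];

// Pick a line to speak based on the detected happiness score (0..1).
function pickLine(happiness) {
  const pool = happiness > 0.5 ? smileQuotes : comfortLines;
  return pool[Math.floor(Math.random() * pool.length)];
}
```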
Why use text-to-speech instead of a pre-recorded voice?
Text-to-speech can say anything you want, and say it a thousand times without extra effort.
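For reference, a hedged sketch of calling Polly from the browser with the AWS SDK for JavaScript; the prototype may instead proxy this call through a server, and the region and credentials below are placeholders:

```javascript
// Assumes the AWS SDK for JavaScript is loaded and credentials are configured
// (placeholders); in practice the call is often proxied through a server so
// the keys never reach the browser.
const polly = new AWS.Polly({ region: "us-east-1" });

function speak(text) {
  const params = { OutputFormat: "mp3", Text: text, VoiceId: "Justin" };
  polly.synthesizeSpeech(params, (err, data) => {
    if (err) return console.error(err);
    // Play the returned audio stream in the page.
    const blob = new Blob([data.AudioStream], { type: "audio/mpeg" });
    new Audio(URL.createObjectURL(blob)).play();
  });
}
```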
Video vs image
In the end I chose to upload a single image instead of continuously analyzing the video stream (see the capture sketch after the list below).
Reason: cost.
- Time of API response: ~2s.
- Google Cloud Vision: $1 per 1000 units.
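Back-of-envelope arithmetic (my own estimate, not from the prototype’s analytics): polling the stream at one call every ~2 s is about 30 calls a minute, or roughly 1,800 per hour per user; at around $1 per 1,000 calls that approaches $2 an hour, while a single snapshot per interaction costs a fraction of a cent. A sketch of grabbing that one frame from the live video element (the JPEG quality value is illustrative):

```javascript
// Grab one still frame from the live <video> element instead of streaming it.
function captureFrame(video) {
  const canvas = document.createElement("canvas");
  canvas.width = video.videoWidth;
  canvas.height = video.videoHeight;
  canvas.getContext("2d").drawImage(video, 0, 0);
  // Resolve with the snapshot as a JPEG Blob, ready for a single upload.
  return new Promise(resolve => canvas.toBlob(resolve, "image/jpeg", 0.9));
}
```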
UI design
Implementation
Based on https://codepen.io/matt-west/pen/wGzuJ
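That CodePen is, as far as I recall, a getUserMedia photo-booth demo; on top of the captured frame, the recognition call might look roughly like this. The endpoint region and key are placeholders, and the Emotion API has since been folded into the Face API, so treat this as a sketch of the original v1.0 interface:

```javascript
// Send the snapshot to the Microsoft Emotion API (v1.0) and return the
// happiness score of the first detected face. Key and region are placeholders.
async function detectHappiness(imageBlob) {
  const res = await fetch(
    "https://westus.api.cognitive.microsoft.com/emotion/v1.0/recognize",
    {
      method: "POST",
      headers: {
        "Ocp-Apim-Subscription-Key": "YOUR_KEY",
        "Content-Type": "application/octet-stream"
      },
      body: imageBlob
    }
  );
  const faces = await res.json(); // one entry per detected face
  return faces.length ? faces[0].scores.happiness : 0;
}
```

Chained together, the whole loop is roughly `captureFrame(video).then(detectHappiness).then(score => speak(pickLine(score)))`.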
Live Prototype
Live demo: https://www.sunwangshu.com/mantra-mirror
Sample voice when you smile (Amazon Polly, Justin):
Sample voice when you show a sad face:
Further thoughts:
- Monetization ($$): the mental health industry
- Emotion diary, e.g. an entry like “Oct. 2017, smile.”
- Social? The experience is rather private for now; otherwise it could be made into a party game, like “try not to laugh first.”
Unused Ideas: Language
Theme: Language
An app to help you speak a foreign language by inputting images.
Foreign Language
- Image -> text: for foreign text you cannot read.
Pronunciation of a foreign language
- Text -> speech: for foreign words you cannot say.
Reasons for not using this idea:
- Similar apps already exist, such as OCR tools and Google Translate.
- OCR works best on scanned images of printed text, which is not so convenient on a mobile phone.
Unused Ideas: Extended Vision
Theme: Extended Vision for Driving
An app that warns you about cars approaching from behind while you are driving.
Something behind you
- Image -> text: things behind you are hard to see. Even though cars have rear-view mirrors, drivers have to pay constant attention to them while also looking ahead, so the cognitive load is high.
Voice assistance
- Text -> speech: information on the phone screen is also hard to see while driving, so it is best delivered by voice.
Reasons for not using this idea:
- Why use a mobile phone in a driving context at all?
- It is doubtful that the technology is ripe enough to estimate distance with only the phone’s single camera; 2D object recognition is not as precise as 3D sensing. If the hints are not reliable, they become additional distractions and could lead to disaster.