Smile at the camera, and you’ll hear a random quote like this:
When you really don’t feel like smiling, it will try to comfort you:
Photo credit: Torrey Wiley
Voice source: Amazon Polly, Justin.
Given image object recognition and text-to-speech technology, how can I create an interesting mobile experience around these capabilities?
It would feel unnatural if the robot just kept saying “Your smile is beautiful.”
- Solution: Quote famous sayings from history:
What if the user just cannot smile at that moment?
- Instead of punishing someone who doesn’t smile, I think it is better to show some empathy.
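The two branches described above (a random quote for a smile, a comforting line otherwise) can be sketched roughly as follows. This is a minimal illustration, not the app's actual code: it assumes the face analysis reports a joy likelihood using the bucket names Google Cloud Vision uses for face attributes, and the `QUOTES` and `COMFORTS` lists, the `is_smiling` and `pick_line` helpers, and the `LIKELY` threshold are all hypothetical.

```python
import random

# Likelihood buckets ordered from least to most likely. These mirror the
# labels Google Cloud Vision reports for face attributes such as joy.
LIKELIHOODS = ["VERY_UNLIKELY", "UNLIKELY", "POSSIBLE", "LIKELY", "VERY_LIKELY"]
RANK = {name: i for i, name in enumerate(LIKELIHOODS)}

# Placeholder lines (assumption): the real quotes and comforts are not listed here.
QUOTES = ["<famous saying 1>", "<famous saying 2>", "<famous saying 3>"]
COMFORTS = ["<comforting line 1>", "<comforting line 2>"]


def is_smiling(joy_likelihood, threshold="LIKELY"):
    """Return True when the joy likelihood reaches the threshold.

    Unrecognized labels (for example "UNKNOWN") count as not smiling.
    """
    return RANK.get(joy_likelihood, -1) >= RANK[threshold]


def pick_line(joy_likelihood, rng=random):
    """Pick a random quote for a smile, or a comforting line otherwise."""
    pool = QUOTES if is_smiling(joy_likelihood) else COMFORTS
    return rng.choice(pool)
```

Drawing at random from a pool is what keeps the response from repeating one fixed compliment, and the fallback pool is where the empathy goes when no smile is detected.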
Why use text-to-speech instead of a pre-recorded voice?
Text-to-speech can say anything you want, and say it a thousand times without effort.
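As a rough sketch of what "say anything you want" looks like in practice, the function below asks Amazon Polly for speech audio using the Justin voice credited on this page. It assumes a Polly client created with boto3 and working AWS credentials; the `speak` helper is hypothetical, and the live demo may call Polly differently.

```python
def speak(polly_client, text, voice_id="Justin"):
    """Return MP3 bytes for `text`, synthesized by Amazon Polly.

    `polly_client` is assumed to be a boto3 Polly client, e.g.:
        import boto3
        polly_client = boto3.client("polly")
    """
    response = polly_client.synthesize_speech(
        Text=text,
        OutputFormat="mp3",
        VoiceId=voice_id,
    )
    # The audio comes back as a streaming body; read it fully into memory.
    return response["AudioStream"].read()
```

Because the text is just a string argument, adding a new quote means adding a line of text rather than recording a new audio clip.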
Video vs image
In the end, I chose to upload a single image instead of continuously watching the video stream.
- Time of API response: ~2s.
- Google Cloud Vision: $1 per 1000 units.
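The two figures above make the trade-off easy to estimate. The sketch below is back-of-the-envelope arithmetic only: it assumes one analyzed image is billed as one unit, ignores any free tier, and treats requests as sequential; the function names and the sampling rate are hypothetical.

```python
PRICE_PER_1000_UNITS_USD = 1.00  # Google Cloud Vision price quoted above
API_LATENCY_S = 2.0              # approximate response time quoted above


def vision_cost_usd(num_images):
    """Cost of analyzing `num_images`, assuming one unit per image."""
    return num_images * PRICE_PER_1000_UNITS_USD / 1000


def video_cost_usd(duration_s, frames_per_second):
    """Cost of sampling a video stream at a fixed frame rate."""
    return vision_cost_usd(duration_s * frames_per_second)


def max_sequential_requests(duration_s):
    """How many back-to-back requests fit in `duration_s` at ~2 s each."""
    return int(duration_s // API_LATENCY_S)


single_image = vision_cost_usd(1)       # one uploaded photo
one_minute_video = video_cost_usd(60, 1)  # one frame per second for a minute
```

Under these assumptions a single photo costs about $0.001, while sampling one frame per second for a minute costs about $0.06, and a ~2 s response time means a sequential loop could react at most about 30 times in that minute.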
Live demo: https://www.sunwangshu.com/mantra-mirror
Sample voice when you smile (Amazon Polly, Justin):
Sample voice when you show a sad face:
- $$: Mental health industry
- Oct. 2017, smile. Emotion diary.
- Social? This experience is rather private for now, but it could be made into a party game, like “try not to laugh first.”
Unused Ideas: Language
An app that helps you speak a foreign language by taking images as input.
- Image -> text, something you cannot see.
Pronunciation of a foreign language
- Text -> image, something you cannot say.
Reasons for not using this idea:
- Similar apps already exist, such as OCR tools and Google Translate.
- OCR works best on scanned images of printed text, which are not convenient to produce with a mobile phone.
Unused Ideas: Extended Vision
Theme: Extended Vision for Driving
An app that alerts you to cars approaching from behind while you are driving.
Something behind you
- Image -> text. Things behind you are hard to see. Even though cars have rear-view mirrors, drivers have to check them constantly while also looking ahead, so the cognitive load is high.
- Text -> speech. The information on the phone is also hard to see, so it is best delivered by voice.
Reasons for not using this idea:
- Why use a mobile phone in a driving context at all?
- It is doubtful that the technology is mature enough to estimate distance with only the phone’s single camera. 2D object recognition is known to be less precise than 3D object recognition. If the instructions given are not helpful, they become additional distractions and could lead to disaster.