Tuesday, January 17, 2012

Iris - Siri-like software

I recently started a pet project, Iris [1], while doing a background study for my recent paper.

Voice commands/response:
Like Siri, it takes user commands and can respond using either Voice or Text.


Portable:
Since it is written in Java, it can run on most operating systems, including Mac, Windows and Linux. I have only tested it on Mac OS X Lion, so I welcome beta testers on other operating systems. For now, some features, such as playing iTunes and opening email, are only available on Mac. I plan to add them on other operating systems as well.

Configurable:
Most features, such as whether VOICE or TEXT is used as the feedback mechanism and what the feedback messages say, are configurable through the configuration file "user_scripts/iris.config". You can also modify any of the predefined AppleScripts in that folder (for example, playSong.scpt), which Iris calls in response to voice commands. For advanced users, there are also ten function commands that can be used to perform user-specific tasks.
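
As a rough illustration of how such a configuration could be read, here is a minimal sketch. It assumes the file holds simple key=value lines, which may not match the real format of iris.config. The class name IrisConfigSketch and the keys feedback.mode and feedback.message are hypothetical and are not taken from the Iris source.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Locale;
import java.util.Properties;

// Hypothetical sketch, not the actual Iris configuration code.
public class IrisConfigSketch {
    // The two feedback mechanisms named in the post.
    public enum FeedbackMode { VOICE, TEXT }

    private final Properties props = new Properties();

    // Loads key=value pairs from any Reader, e.g. a FileReader
    // opened on user_scripts/iris.config.
    public IrisConfigSketch(Reader source) throws IOException {
        props.load(source);
    }

    // Assumed key "feedback.mode"; falls back to TEXT when the key is missing.
    // Throws IllegalArgumentException if the value is neither VOICE nor TEXT.
    public FeedbackMode feedbackMode() {
        String raw = props.getProperty("feedback.mode", "TEXT");
        return FeedbackMode.valueOf(raw.trim().toUpperCase(Locale.ROOT));
    }

    // Assumed key "feedback.message"; falls back to a placeholder message.
    public String feedbackMessage() {
        return props.getProperty("feedback.message", "Done.");
    }

    public static void main(String[] args) throws IOException {
        // In-memory sample so the sketch runs without any file on disk.
        String sample = "feedback.mode=voice\nfeedback.message=At your service\n";
        IrisConfigSketch config = new IrisConfigSketch(new StringReader(sample));
        System.out.println(config.feedbackMode() + ": " + config.feedbackMessage());
    }
}
```

Reading from a Reader rather than a fixed path keeps the sketch easy to try with a file or with an in-memory string.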

Easy to install and use:
All you need to do is download and extract the binary file and run the following command:
 
java -Xmx200m -jar bin/Iris.jar
The above command asks Java to run the program Iris.jar (in the bin folder) and, through the -Xmx200m option, caps the Java heap at 200 MB.
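
If you want to confirm that the memory limit took effect, a small stand-alone check like the one below can help. This is only a sketch: the class name HeapLimitCheck is made up for illustration and is not part of Iris. It relies on the standard Runtime.maxMemory() call, which reports the maximum amount of memory the JVM will attempt to use.

```java
// Hypothetical helper, not part of Iris: reports the JVM's maximum heap size.
public class HeapLimitCheck {
    // Converts the JVM's reported maximum memory from bytes to megabytes.
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMegabytes() + " MB");
    }
}
```

Running it as "java -Xmx200m HeapLimitCheck" should print a value close to 200. Depending on the JVM and garbage collector, the number can come out slightly lower.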

Lightweight:
The whole setup takes about 20MB.

Client-side:
The speech-to-text conversion is done on your laptop/desktop and not on some remote server. This means you can use it even if you have no internet connection. It also means that there is no possibility of some big brother monitoring your voice commands.

Open source:
You are free to download and modify the source. As I am working towards a paper deadline, I am looking for collaborators who could help me with the following features:
  1. User-friendly: Improving voice commands
  2. More accuracy: Expanding dictionary to include more words and also to add different 'user accents' for a given word.
  3. Expanding grammar to include arbitrary spoken words. 
  4. AppleScripts for accessing contacts and recognizing contact names, to assist users with various functions.
  5. Adding a GUI in place of the command-line interface. Since all the responses go through OutputUtility.java, this is going to be easy.
  6. New features: Support for RSS, Top News, Weather, Popular city names, Dictionary/Thesaurus, Web/Wikipedia search, Dictation (to text/messages/events/todo item/alarm/reminders), Jokes, Motivational quotes, More Personalization.
If you have suggestions for improvement or want to collaborate with me, email me at 'NiketanPansare AT GMail'.
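
For anyone interested in item 5, here is a rough sketch of why a single output path makes a GUI easy to add. It does not show the real OutputUtility.java, whose API is not described in this post. The names OutputSketch, ResponseSink, ConsoleSink and BufferSink are all hypothetical. The idea is simply that if every response passes through one method, swapping the destination takes one line.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a single output path; not the actual OutputUtility.java.
public class OutputSketch {
    // Anything that can display a response: console, GUI widget, speech, ...
    public interface ResponseSink {
        void show(String response);
    }

    // Command-line behaviour: print each response to standard output.
    public static class ConsoleSink implements ResponseSink {
        @Override
        public void show(String response) {
            System.out.println(response);
        }
    }

    // Stand-in for a GUI: collects responses so a window could render them.
    public static class BufferSink implements ResponseSink {
        private final List<String> lines = new ArrayList<>();

        @Override
        public void show(String response) {
            lines.add(response);
        }

        public List<String> lines() {
            return lines;
        }
    }

    private static ResponseSink sink = new ConsoleSink();

    // Swapping the sink redirects every response without touching the callers.
    public static void setSink(ResponseSink newSink) {
        sink = newSink;
    }

    // The one method all responses go through.
    public static void respond(String response) {
        sink.show(response);
    }

    public static void main(String[] args) {
        respond("Shown on the command line");
        BufferSink gui = new BufferSink();
        setSink(gui);
        respond("Captured for a GUI");
        System.out.println("Buffered responses: " + gui.lines().size());
    }
}
```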

Iris uses CMU's Sphinx4 for recognizing speech and FreeTTS for text-to-speech conversion.

References:
[1] Iris is the name of my friend's dog (well, no pun intended when I use the term 'pet project'). It is also an anagram of the word 'Siri' (Apple's popular software for the iPhone 4S).

2 comments:

Suchi said...

That is so cool...

Why does it not speak back to me, the way it does in your video? I guess it doesn't like my accent and does not want to talk to me :(

Niketan said...

Thanks Suchi for testing the software.

Did you use one of the predefined commands, such as "What time is it", "what time is it in south carolina", or "play music"? See the README for the complete list.

As of now, it is supposed to remain silent when:
- It recognizes what you said, but it is not a predefined command
- OR it is unable to recognize what you said

Don't worry about your accent. Just speak naturally.

The next version is right around the corner ... It will be much more user-friendly and powerful.