Tuesday, May 28, 2013

Speech Recognition of Numbers for Timed Events

One of the knocks against using speech recognition for home automation is that a different method, like a touchscreen or remote, would give you more efficient control. However, timed events are one use case that I find works better with voice control. Let's say I want to turn on a light for 20 minutes. With a GUI, I have to select the light and then select a mode (delayed: turn on after X seconds; interval: turn on for X minutes, then turn off). To select the delay or duration, I would need sliders, text boxes to type the time, or maybe a drop-down list of predefined times. If I want to schedule by day or date, then I would need a date picker/calendar as well. Or, I could just say "At 6PM on Sunday, turn on the porch light for 15 minutes."

Continuing with my recent experiments with Android speech recognition, I began adding timed events. One advantage of a free-form speech recognition engine like Google's is the ability to recognize any number that's spoken. You're not limited to a set of predefined options, like with Homeseer:
<1|2|3|4|5|6|7|8|9|10|11|12|13|14|15|16|17|18|19|20|30|40|50|60|70|80|90> <seconds|second|minutes|minute|hours|hour|days|day>
Kind of a nitpick, but what if you want 25 minutes? Can't do it, but Google will recognize whatever you say, whether it's 3 fortnights or 2,567 seconds. It may not be necessary, but it gives you the flexibility to do whatever you want. It's up to your software to parse out the numbers and units. With Python, it's simple to check the sentence for a particular pattern and extract the necessary parameters. The following code shows how to extract information for basic delayed/duration events.
import re

regex_delay = re.compile(r'\b(in|for|after)\s+(\d+)\s+(day|hour|minute|second)s?\b')
if regex_delay.search(msg):
  delay_parm = regex_delay.findall(msg)
delay_parm will now contain a list of groups. If your command is "turn off the garage light after 15 seconds", you'll get this:
delay_parm = [('after', '15', 'second')]
delay_parm[0][0] = after  # delay type
delay_parm[0][1] = 15     # delay value
delay_parm[0][2] = second # delay unit
Now you have all the information you need to perform the action:
# convert to a common unit, seconds
if delay_parm[0][2] == "day":
  delay_time = int(delay_parm[0][1]) * 60 * 60 * 24
elif delay_parm[0][2] == "hour":
  delay_time = int(delay_parm[0][1]) * 60 * 60
elif delay_parm[0][2] == "minute":
  delay_time = int(delay_parm[0][1]) * 60
else: # second
  delay_time = int(delay_parm[0][1])

if delay_parm[0][0] == "for":
  # duration event: act now, then reverse the state after delay_time
  ...
else:
  # "in"/"after": wait delay_time, then act
  ...
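With delay_time in seconds, one simple way to actually fire the event is threading.Timer. This is a minimal sketch, not the post's actual code; set_device is a hypothetical stand-in for whatever issues the real HA command:

```python
import threading

actions = []   # records (device, state) calls; stands in for real HA output

def set_device(device, state):
    # hypothetical stand-in for whatever issues the actual command
    actions.append((device, state))

def schedule(delay_type, delay_time, device, state):
    """Act on the parsed (type, seconds) parameters from delay_parm."""
    if delay_type == "for":
        # duration event: act now, then reverse the state after delay_time
        set_device(device, state)
        opposite = "off" if state == "on" else "on"
        timer = threading.Timer(delay_time, set_device, [device, opposite])
    else:
        # "in"/"after": wait delay_time, then act
        timer = threading.Timer(delay_time, set_device, [device, state])
    timer.start()
    return timer
```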
Scheduling an event based on a day ("next Tuesday"), a time ("at 3PM") or date ("December 31, 2014") is just an extension of this. Take a look at this demo where I'm showing a time based reminder and a delayed lighting event.
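As a sketch of that extension (the pattern and helper name below are my own, not from the post), a clock-time phrase like "at 6PM" can be converted to a delay in seconds the same way:

```python
import re
from datetime import datetime, timedelta

regex_at = re.compile(r'\bat\s+(\d{1,2})(?::(\d{2}))?\s*([ap])\.?m\.?\b', re.I)

def seconds_until(msg, now=None):
    """Return seconds from `now` until the spoken clock time, or None."""
    m = regex_at.search(msg)
    if not m:
        return None
    hour = int(m.group(1)) % 12
    if m.group(3).lower() == 'p':
        hour += 12
    minute = int(m.group(2) or 0)
    now = now or datetime.now()
    target = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if target <= now:
        target += timedelta(days=1)   # that time already passed -> tomorrow
    return (target - now).total_seconds()
```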

Saturday, May 18, 2013

Generic Speech Recognition Script for SL4A

I've finally had some time to make a generic speech recognition script that hopefully any SL4A-capable Android device can use. I've taken parts of my script from the previous post and added in some sample pattern matching from my server-side script. The result is a script that can issue commands to your home automation controller or software by fetching URLs. A couple of prerequisites: you must have SL4A and Python for Android installed on your device. It would also help to be familiar with some Python and its regular expression syntax. The code is well commented and has samples for recognizing phrases like "turn off the kitchen light" and "turn the master lights off" - so hopefully that's enough to kickstart your automating. So go ahead and get it!
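The URL-fetching part can be as small as this sketch (the query-string scheme is made up for illustration; substitute whatever URL format your controller actually expects):

```python
try:                                     # Python 3
    from urllib.parse import urlencode
except ImportError:                      # Python 2, as on SL4A's Python for Android
    from urllib import urlencode

def command_url(base_url, device, state):
    # the ?device=...&state=... scheme here is hypothetical -- match it to
    # whatever URL your HA controller or software expects
    return base_url + '?' + urlencode({'device': device, 'state': state})
```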

Saturday, May 11, 2013

More Two Way Interaction With Android Speech Recognition

Let's start with the demo first. The video shows commands and queries being spoken and recognized by Android speech recognition. Our web GUI is also in the shot - since the wife forbids me to walk around filming the insides of our house for the world to see :) - so you can at least "see" some of the status being queried and the results of some actions. There are some annotations on the video, but you can't see them on the embedded player. Click through to YouTube to see the video with annotations.

What makes all that stuff work is that the queries are passed to a server for processing. That opens up two-way interaction, where you can not only control your system but query it as well. As I mentioned in my previous post on this topic, using IM as a transport mechanism allows the recognized phrase to be sent to the server and the responses sent back to the Android device. Over on our server, EVERY device and its state is logged in our MySQL database. This was done when we built our AJAX-based GUI. Also, since our system is distributed, MySQL provides a place for status to be updated and synced between the various devices. Below is a snapshot of a phpMyAdmin page showing part of one of our database tables. The table contains the device name, type, its state, and when it was turned on and off.

Every device and its state is stored: every light, appliance, AV device, motion sensor, door, window, lock, car, phone, computer, etc. Whenever a device's state changes, a function gets triggered in whatever software is interfacing to that device (which is, for the most part, Windows scripting like the JScript below):

function setStatusOnOff(device,type,state,secs) {
    try {
        if (state=="off") {
            mysqlrs.Open("insert into status (device,type,state,secs_off) values ('"+device+"','"+type+"','"+state+"','"+secs+"') on duplicate key update state='"+state+"', secs_off='"+secs+"'",mysql);
        } else {
            mysqlrs.Open("insert into status (device,type,state,secs) values ('"+device+"','"+type+"','"+state+"','"+secs+"') on duplicate key update state='"+state+"', secs='"+secs+"'",mysql);
        }
    } catch(e) {
        // log the database error
    }
}
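The same upsert could be expressed on the Python side with placeholder parameters instead of string concatenation, which avoids quoting problems. This is my own sketch, not the post's code; the (sql, params) pair would be handed to something like a MySQLdb cursor.execute(), and that driver choice is an assumption:

```python
def status_upsert(device, dev_type, state, secs):
    """Build the status-table upsert as (sql, params) with placeholders.
    Table and column names follow the JScript above."""
    col = "secs_off" if state == "off" else "secs"
    sql = ("insert into status (device,type,state," + col + ") "
           "values (%s,%s,%s,%s) "
           "on duplicate key update state=%s, " + col + "=%s")
    return sql, (device, dev_type, state, secs, state, secs)
```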

Since the device name is stored as a normal non-abbreviated name ("family room tv" instead of "frtv"), it's straightforward to use the recognized speech to search for devices using MySQL queries. The next step is to figure out what type of command is being issued. For example, a command will have the phrase "turn on" or "turn off" in it. Since I use Python on the server to process the speech, I use its regular expression (regex) functions to pattern match for commands:

reTurn = re.compile(r'(^|\s+)turn.*\s+o(n|(f[f]*))($|\s+)', re.I) # matches "turn on", "turn this and that on", "turn this and that blah blah blah off", plain "turn off", even "turn of" - anywhere in a sentence
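For instance, a small helper (the function is mine, added for illustration) can use that pattern to classify the phrase and pull out the desired state:

```python
import re

reTurn = re.compile(r'(^|\s+)turn.*\s+o(n|(f[f]*))($|\s+)', re.I)

def command_state(msg):
    """Return 'on' or 'off' for a turn command, or None if it isn't one.
    Group 2 of the pattern captures the 'n' or 'f(f)' after the 'o'."""
    m = reTurn.search(msg)
    if not m:
        return None
    return 'on' if m.group(2).lower() == 'n' else 'off'
```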

After figuring out if it's a command or query, my script then strips out extraneous text to simplify extracting the device and type. What gets stripped out depends on how things are phrased in your household. Here's a snippet to do that, where msg is the recognized phrase:

msg = re.sub(r'(^|\s)(turn|the|a|can|you|please|will)\s+', ' ', msg) # strip out unneeded text

I'm experimenting with natural language processing to strip out unnecessary words automatically but it's not ready yet. Next, the script figures out the type of device involved. For lighting, it would use a regex similar to this:

reLight = re.compile(r'\s+(lights?|lamps?|chandeliers?|halogens?|sconces?)($|\s)', re.I)

Since all the extra words have been stripped out and the type has been determined, all that's left is to formulate a MySQL query like this to get the actual device name:

if reLight.search(msg):
  msg = re.sub(r'\s+', '%', msg) # replace spaces with the SQL wildcard character %
  query = "select * from lighting where device like '%" + msg + "%'"

This is necessary to remove ambiguities in the recognition. A light may be named "guestbath" in the HA system, but Google may pass the recognized phrase as "guest bath." With the actual device name, the final steps are to issue the command and send a response back to the Android device. As lights and other devices are added to the HA system, nothing else needs to be set up. Contrast that with other automation systems, where you have to set up a recognition phrase for every device and possibly every state in your system. In our system, new device names are parsed out of the database, and no changes are required on the Android device. Queries follow a similar flow, except instead of issuing a command, a response is formulated with the status and sent back to the user.
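Putting those steps together, here's a condensed sketch of the whole strip-and-query flow (the exact stop-word list is my own adaptation):

```python
import re

def device_query(msg):
    """Strip filler words, wildcard the rest, and build the device lookup.
    Table/column names follow the post; the helper itself is a sketch."""
    msg = re.sub(r'\b(turn|the|a|can|you|please|will|on|off)\b', ' ', msg, flags=re.I)
    msg = re.sub(r'\s+', '%', msg.strip())   # spaces -> SQL wildcard %
    return "select * from lighting where device like '%" + msg + "%'"
```

With this, "guest bath" in the spoken phrase still matches a device stored as "guestbath", since the wildcards absorb the spacing difference.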

That's the backend. I'll cover the frontend in another post.

Thursday, May 9, 2013

Using CanvasJS to Graph Power Consumption

We've been using RRDtool for graphing everything from temperatures to disk usage to power consumption. It's very powerful and makes some nice charts, but I can never remember how to set up the database. Plus, my server is constantly running the tool to regenerate the graphs every 15 minutes so they're relatively up to date when someone views them. I'm now playing with CanvasJS, which uses HTML5 and JavaScript to easily generate some really cool graphs. I'm using power consumption as my test bed for implementing CanvasJS. Data for the power consumption is dumped into our MySQL database every 2 minutes (it's actually coming in every second, but I'm only sampling it every 2 minutes for this graphing application). With some JavaScript and PHP pulling the data out of MySQL, the charts are generated on the fly. It works REALLY well, and it's fast. You can pan and zoom the chart to see the exact power usage at a specific time. Check out the gallery for more samples with code. I will probably transition all of our system's graphing over to CanvasJS after I have more time to experiment. In the meantime, here's a short video showing the power consumption graphs I'm working with.