Sunday, April 21, 2013

Flat UI Update to GUI

Our major HA user interfaces are web-based, and when I see prepackaged UI elements, I think about how I can use them to improve what I have. We mainly use our custom floorplan GUI, but occasionally use some older interfaces. One of them is below, design circa 1999 :) As you can see, the artistic side of my brain isn't very developed.



It's actually the 2nd generation - the first was ASP and IIS based. The 2nd version didn't change the UI at all, just the implementation, which I switched to Perl CGI and Apache. I ran across the Flat UI package some time ago. Flat design, from what I've gathered, is how many of the newer, hip websites are styled. I figured I'd give it a try and bring some of its design elements into our UI. As a tutorial, I've redone the above lighting UI with toggles and sliders from Flat UI. You can see how much nicer it looks in this short clip:



It's also more finger-friendly for tablet use. I made my own tweaks to fit in my existing template, but I still need to move things around a little (slide the toggles down a few pixels, maybe reduce the open space). It was relatively painless (I already know enough JavaScript and jQuery to figure out most of the kit), and I can see using it in other interfaces.

Tuesday, April 16, 2013

Using SL4A and Android Speech Recognition for Home Automation

My latest project has been experimenting with SL4A (Scripting Layer for Android) and Python on my Galaxy Note II. I started with the included saychat.py sample to build a simple script that kicks off Android speech recognition. It takes the text result returned from Google and sends it over IM to our HA server. The HA server does some basic natural language processing on the text, extracting commands and performing the operations if any valid ones are found. It then returns a response over IM to the phone with the result of the command(s). Back on the phone, the Python script has been waiting for this confirmation and uses TTS to read it back. The cycle repeats until the user says "goodbye" or it gets two consecutive recognition results with nothing heard. Here's a short YouTube video of it in action:



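To give an idea of the structure, here's a hedged sketch of that loop - not my actual script. The recognize, send_im, wait_reply and speak callables are stand-ins: on the phone, recognize and speak would wrap the SL4A facade calls droid.recognizeSpeech() and droid.ttsSpeak(), and an XMPP library would handle the IM leg.

```python
def conversation_loop(recognize, send_im, wait_reply, speak, max_silent=2):
    """One voice session: recognize -> IM to HA server -> TTS the reply.

    recognize()  -> recognized text, or "" when nothing was heard
    send_im(t)   -> send the text to the HA server over IM
    wait_reply() -> block until the server's confirmation arrives
    speak(t)     -> read the confirmation back with TTS
    """
    silent = 0
    while True:
        heard = recognize()
        if not heard:
            silent += 1
            if silent >= max_silent:  # two empty results in a row: stop
                break
            continue
        silent = 0
        if heard.strip().lower() == "goodbye":
            break
        send_im(heard)
        speak(wait_reply())
```

Passing the I/O in as callables is just so the control flow can be exercised off the phone; the real script wires them straight to the SL4A and IM calls.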
From the video, you can see I've tried to parse the speech so that it can find the commands and devices even when a command is phrased differently. I used three different phrases:

  • "Can you turn on the kitchen light and dining room light?"
  • "Can you turn off the lights in the kitchen?"
  • "Turn off the dining room light."

I was trying to avoid having only simplistic commands like the last one. The first one demonstrates the ability to speak a command for multiple devices, and the ability to preface the command with "Can you" or pretty much anything, like "The dog wants you to" ;) The 2nd command shows that it's not restricted to parsing "kitchen light" together. The last command is a typical HA VR command. My parser can also decode multiple commands in one, such as "Turn off the kitchen light, the guest bath fan and living room light and turn on the back floods." The only challenge is saying everything you want to say without much of a pause; otherwise, recognition stops and a partial command is sent.
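Here's a simplified sketch of the idea (not my actual parser): split the request into chunks, let one "turn on/off" carry over to later devices, and match devices as word sets instead of fixed phrases, so fillers like "Can you" fall away. The DEVICES set just holds the example names from above.

```python
import re

# Example device names taken from the commands above
DEVICES = {"kitchen light", "dining room light", "guest bath fan",
           "living room light", "back floods"}

def _words(s):
    """Lowercase words with a crude plural strip ('lights' -> 'light')."""
    return {w.rstrip("s") for w in re.findall(r"[a-z]+", s.lower())}

def parse_commands(text):
    """Split a spoken request into (action, device) pairs. One 'turn on/off'
    applies to every device that follows until a new one is spoken, and a
    device matches when all of its words were heard, in any order."""
    commands, action = [], None
    for chunk in re.split(r",|\band\b", text.lower()):
        m = re.search(r"turn (on|off)", chunk)
        if m:
            action = m.group(1)
            chunk = chunk[m.end():]
        heard = _words(chunk)
        for dev in sorted(DEVICES):
            if action and _words(dev) <= heard:
                commands.append((action, dev))
    return commands
```

The word-set match is what lets "the lights in the kitchen" find "kitchen light" without the two words being adjacent.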

A few advantages of using this setup:

  • Google's speech recognition in the cloud is probably the best, most up-to-date system available. They started building it with the now-closed GOOG-411 service, and further fine-tuning comes from the millions of voicemails their Google Voice service transcribes. Their Chrome browser also uses their speech recognition and of course, so do the millions of Android users. All this input goes into tuning their accuracy. If you're using Microsoft's Windows VR, you're probably getting something that gets updated every few years with each OS release - if you're upgrading. With HAL, you're getting a 1990s VR engine; I'm not even sure that gets updated anymore.
  • Google's free form speech recognition allows the most flexibility in speaking commands. Granted, that makes the parsing more difficult, but it allows a system that can more accurately respond to the different ways different people phrase commands. Most speech recognition engines I've worked with require you to pre-program canned phrases in order to recognize commands. If you deviate just a little from what's programmed, good luck getting your command recognized.
  • By using Jabber IM as the transport mechanism for recognized commands, the same system that works at home works when you're away. You just turn on your mobile data - there are no VPNs or SSH tunnels to set up every time you want to speak a command. You get one level of security for free, since your home's IM client must have pre-approved other users (adding them to its roster) before they can communicate. Another level can be added at the scripting layer of your HA software by limiting which IM users can issue certain commands. For extra security, you can even encode or encrypt the text sent over IM, though if you're using Google Talk servers, your communication is already wrapped in SSL.
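As an illustration of that second, per-user layer (the JIDs and command categories below are made up for the example), the check on the HA server can be as simple as a lookup:

```python
# Hypothetical per-user permissions, consulted after the XMPP roster has
# already filtered who may talk to the HA account at all.
ALLOWED = {
    "owner@example.com": {"lights", "hvac", "security"},
    "guest@example.com": {"lights"},
}

def authorized(jid, category):
    """May this IM user issue commands in this category?"""
    bare = jid.split("/")[0].lower()  # drop any XMPP resource, e.g. /phone
    return category in ALLOWED.get(bare, set())
```

Unknown senders fall through to an empty set, so anyone not explicitly listed can't issue anything.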

A few more details: using SL4A, I cannot control the default speech recognition sounds, which can get annoying after a while. I'm using Nova Launcher as my launcher instead of TouchWiz. Nova Launcher lets you remap the home key behavior on the home screen: when pressed, instead of showing the zoomed-out view of all my screens, it kicks off the script. Also, my HA device database is stored in MySQL, which allows for powerful searches and easy matching of what's spoken to actual devices - even when the device name isn't exactly the same as what was spoken. I've been using the MySQL setup, IM interface and command parsing for many years now (although the parsing was more primitive), so integration was extremely simple. At some point, I'd like to implement NLTK, the Natural Language Toolkit, for more complex language processing.
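To illustrate the kind of inexact matching I mean, here's a sketch that uses an in-memory SQLite table as a stand-in for the real MySQL database - the table layout and device rows are made up for the example.

```python
import sqlite3

# In-memory SQLite stand-in for the MySQL device table; schema is invented.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE devices (id INTEGER PRIMARY KEY, name TEXT)")
db.executemany("INSERT INTO devices (name) VALUES (?)",
               [("kitchen light",), ("dining room light",), ("back floods",)])

def find_device(spoken):
    """Match spoken words to a device name even when they aren't an exact
    phrase, by requiring every word of the name (or its plural) to appear
    somewhere in what was heard."""
    words = [w.strip(".,?!").lower() for w in spoken.split()]
    best, best_score = None, 0
    for dev_id, name in db.execute("SELECT id, name FROM devices"):
        parts = name.split()
        score = sum(1 for w in parts if w in words or w + "s" in words)
        if score == len(parts) and score > best_score:
            best, best_score = name, score
    return best
```

With the real database, the same idea could also be pushed into the query itself (LIKE clauses per word), but the scoring version is easier to show here.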


Wednesday, April 3, 2013

2012 Most Downloaded (3 months late)

Here is our list of popular downloads for 2012:

1. EventGhost xPL Plugin - 99 times
2. xScript - 77
3. BlueTracker - 61
4. xPLGVoice - 39
5. xPLSerial - 17
5. BlueTrackerScript - 17
7. xPLGCal - 16
8. t2mp3 - 15
8. Blabber - 15
8. Noise - 15
8. xPLChumby - 15

The EG plugin continues to be popular despite our stopping development on it years ago. Four of the top five and their download counts are almost exactly the same as last year, with xPLChumby dropping off and xPLGVoice getting more interest. There are no new apps on the list since I did virtually no HA work last year, but it's nice to see a similar amount of interest in our existing apps. I haven't had the urge to code anything new, let alone time to brainstorm new things. We'll see what 2013 brings...at least I'm blogging again.