Following the discussions in the Snips Community Chat has confirmed my feeling that there is still a lot of misunderstanding about how Snips actually works. People repeatedly ask why, despite a successful installation of Snips, a deployed assistant and working speech recognition, there is no speech output when they ask, for example, "Hey Snips. How will the weather be tomorrow?". Difficulties like these account for about 50% of all questions in the forum. So a few days ago I sent the following tweet to address this topic.

I just wanted to point out a typical beginner's mistake. But the "Snips Power Users" in particular objected that they almost exclusively use skills without defined skill actions and that Snips does indeed generate a response. And now? What's the right thing to do? In fact, there are several ways to get Snips to perform a voice output or other action. Let's see if I can use this blog post to shed a little more light on how to trigger actions in Snips.

How do these problems manifest in practice?

Look at the following log output, which you get by running sam watch before you start the voice input.

[17:40:08] [Hotword] was asked to toggle itself 'off' on site default
[17:40:08] [Dialogue] session with id '48bef6a7-a316-4ecf-be9d-e273d82bd29f' was started on site default
[17:40:08] [Asr] was asked to listen on site default
[17:40:13] [Asr] captured text "what´s the weather" in 5.0s
[17:40:13] [Asr] was asked to stop listening on site default
[17:40:13] [Nlu] was asked to parse input what´s the weather
[17:40:13] [Nlu] detected intent searchWeatherForecast with probability 0.993 for input "what´s the weather"
[17:40:13] [Dialogue] New intent detected searchWeatherForecast with probability 0.993
[17:40:19] [Dialogue] session with id '48bef6a7-a316-4ecf-be9d-e273d82bd29f' was ended on site default. The session was ended because one of the component didn´t respond in a timely manner
[17:40:19] [Asr] was asked to stop listening on site default
[17:40:19] [Hotword] was asked to toggle itself 'on' on site default

Everything looks good and without errors at first glance:

  • the Hotword has been recognized
  • the DialogManager has started a session
  • ASR has captured the correct input
  • NLU has matched the correct intent

And yet shortly afterwards the session was ended without the expected voice output. So let's have a look at what services are running on the Snips device by running sam status. Maybe this way we can narrow down the problem.

$ sam status

Connected to device

OS version ................... Raspbian GNU/Linux 9 (stretch)
Installed assistant .......... Snips
Language ..................... de
Hotword ...................... hey_snips
ASR engine ................... snips
Status ....................... Live

Service status:

snips-analytics .............. 0.55.2 (running)
snips-asr .................... 0.55.2 (running)
snips-audio-server ........... 0.55.2 (running)
snips-dialogue ............... 0.55.2 (running)
snips-hotword ................ 0.55.2 (running)
snips-nlu .................... 0.55.2 (running)
snips-skill-server ........... 0.55.2 (not running)
snips-tts .................... 0.55.2 (running)

Here we quickly see the cause. The snips-skill-server is not running because no skills have been installed yet. Accordingly, no action was executed after the intent had been correctly identified.
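To spot this condition without reading the whole report, the service lines can be scanned for anything that is not running. The following is a minimal sketch, assuming the output format shown above (service name, dots, version, state in parentheses). The function name services_not_running is made up for this illustration, and the text has to be captured from sam status yourself.

```python
import re

# Matches lines such as "snips-tts .................... 0.55.2 (running)".
SERVICE_LINE = re.compile(r"^(snips-[\w-]+)\s+\.+\s+(\S+)\s+\((.+)\)\s*$")


def services_not_running(status_text):
    """Return the names of services whose state is not 'running'."""
    stopped = []
    for line in status_text.splitlines():
        match = SERVICE_LINE.match(line.strip())
        if match and match.group(3) != "running":
            stopped.append(match.group(1))
    return stopped


# Shortened copy of the service list from the report above.
sample = """\
snips-nlu .................... 0.55.2 (running)
snips-skill-server ........... 0.55.2 (not running)
snips-tts .................... 0.55.2 (running)
"""
print(services_not_running(sample))  # ['snips-skill-server']
```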

Why do some newbies fall into this trap?

I think there are essentially two reasons for this. The first is that the implementation of Skill Actions changed significantly with version 0.55.x. This can be solved by better and up-to-date documentation, and the Snips team is doing a good job here. There is also a guide for migrating old skill actions to the current flavor.
The second and, in my opinion, main reason is that Snips works in a completely different way from other assistants. For example, if you install or activate an Amazon Alexa skill, this skill is ready to use in most cases. A skill in the Snips universe does not necessarily have to have an implementation. The focus of a skill is on defining intents and slots so that the ASR (Automatic Speech Recognition) component behind the scenes can do the best possible job because the underlying model is well-trained. Users assemble their assistant from their favorite skills, deploy it and simply expect audio output. But that last step is the responsibility of every user, unless you use skills that already define skill actions. You can recognize those by the small icon marked in red in the screenshot of my tweet above.

Here are five options to trigger actions in Snips, where an action can be voice output (TTS), switching lamps on and off, or anything else. If you are one of the users who have fallen into this trap, you should definitely read on.

1. Code Snippets

This is probably the easiest way to trigger an action for a matched intent. Code Snippets are entered directly in the Snips Console. The source code is therefore part of the deployed assistant, which makes updates more cumbersome, since the entire assistant must be redeployed. With Code Snippets the programming language is limited to Python 2, and you don't have the full power of the language at hand, because a code snippet is limited to the implementation of the action body. This means that you cannot import further modules or introduce further methods to structure your code. Access to a database or any other remote server is therefore hardly possible. But in most cases it is enough to trigger a voice output. Simply copy and paste this two-liner into the Code Snippet section in the Snips Console, then run sam install assistant and afterwards sam install skills to deploy your assistant and skills to the device.

current_session_id = intentMessage.session_id
hermes.publish_end_session(current_session_id, "Hello World")

2. Github Repository

As soon as you want to write more extensive and complex code, you will reach the limits of Code Snippets. In that case, you should switch to the Github Repository option in the Snips Console. Of course, using a Git repository gives you much more flexibility in structuring the source code. The only requirement is that the root of the repository contains an executable file that is responsible for installing the required dependencies of the skill. In the case of Python, this is typically done using a virtual Python environment (virtualenv) per skill. For each intent, an executable file matching the pattern action-*.py must exist in the root. In the case of the HelloWorld skill the only intent is named hello, and the Python executable is named after it following this pattern. Run sam install assistant and afterwards sam install skills to deploy your assistant and skills to the device.

Please have a look at the corresponding repository, which provides the implementation of HelloWorld. Nothing fancy - but it is good enough to demonstrate the functionality.
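To give a rough idea of the shape of such an action file, here is a minimal sketch. It assumes the hermes-python library is installed in the skill's virtualenv and that the MQTT broker listens on localhost:1883; the names handle_hello and MQTT_ADDR are chosen for illustration only. It is not the code from the HelloWorld repository, and the intent name may need a user prefix depending on how the assistant was built.

```python
#!/usr/bin/env python
# Sketch of an action file for the intent "hello" (assumed layout).

try:
    from hermes_python.hermes import Hermes
except ImportError:
    Hermes = None  # hermes-python is not installed in this environment

MQTT_ADDR = "localhost:1883"  # assumption: broker runs on the Snips device


def handle_hello(hermes, intent_message):
    """End the session and let the TTS speak the reply."""
    hermes.publish_end_session(intent_message.session_id, "Hello World")


def main():
    if Hermes is None:
        print("hermes-python is not installed, nothing to do")
        return
    # Register the handler for the intent and block, waiting for messages.
    with Hermes(MQTT_ADDR) as h:
        h.subscribe_intent("hello", handle_hello).start()


if __name__ == "__main__":
    main()
```

The handler body is the same two-liner as in the Code Snippet option; the difference is that the file owns its imports and its connection to the broker, so it can be structured and extended freely.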

As a side note, I would like to say that you can link to any other Git repository, even if it is not yet officially supported. It does not necessarily have to be on Github. For example, I've been using Gitlab for quite some time.

3. sam install skills --git

The third option is very similar to the second. The main difference is that the link to the Git repository with the implementation of your skill action is not saved as part of the assistant. The repository contents are the same in both cases, because it is the same repository. This option makes you even more flexible, since you can quickly switch between repositories and branches on the command line, for example for testing, without having to re-deploy the assistant.

$ sam install skills --git

4. MQTT for the masses

For the previous three options, we always used the Python programming language. But luckily, the MQTT protocol used internally and the design of snips-skill-server make Snips much more flexible. Any programming language or library that can talk MQTT in version 3.1.1 (at the time of writing) can be used to implement intent actions. So why not use JavaScript or Java to implement your handler code? And since MQTT clients communicate through a central server, the intent action does not even have to run on the Pi where the Snips services are running. As long as you have permission to connect to the MQTT server, it can run almost anywhere.

$ sam install skills --git

The implementation in JavaScript uses the Node Package Manager (npm) to install the dependencies. All other rules remain as described above. You have to provide an executable, and the Node.js executable is named action-hello.js. Please have a look at the corresponding repository, which provides the implementation skeleton in JavaScript/Node.js.
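Whatever the language, a handler that talks MQTT directly does two things: it reads the JSON intent message and publishes a JSON reply to the dialogue manager. The sketch below shows only that message handling, in Python with the standard library; the MQTT client itself is left out. The topic names (hermes/intent/... and hermes/dialogueManager/endSession) and the sessionId and text fields are given as I understand the Hermes protocol and should be checked against the Snips documentation. build_end_session is a name made up for this example.

```python
import json

END_SESSION_TOPIC = "hermes/dialogueManager/endSession"  # assumed topic name


def build_end_session(intent_payload, text):
    """Turn a received intent message into a (topic, payload) reply pair."""
    intent = json.loads(intent_payload)
    reply = {"sessionId": intent["sessionId"], "text": text}
    return END_SESSION_TOPIC, json.dumps(reply)


# Illustrative, shortened message as it might arrive on "hermes/intent/hello".
incoming = json.dumps({
    "sessionId": "48bef6a7-a316-4ecf-be9d-e273d82bd29f",
    "siteId": "default",
    "intent": {"intentName": "hello"},
})

topic, payload = build_end_session(incoming, "Hello World")
print(topic)
print(payload)
```

Any MQTT client in any language can then publish the returned payload to the returned topic, which is why the handler does not need to live on the Snips device.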

5. Home Assistant

This last option is somewhat out of the ordinary, as it is more a matter of configuration than of programming. It is a ready-made component for communicating with your Home Assistant installation that you can activate in the Snips Console. Since both products are based on MQTT, the main focus is on the exchange of events between the two systems. However, we will take a closer look at this option in a separate blog post.

As a summary, the following definition of Snips' abilities and tasks fits quite well: Snips captures the audio, turns it into text and parses it into intents with slots. You have to make it speak.

Drop me a note if I have forgotten some option.

Lars Martin

I have been developing software for 20 years - mostly on the basis of the JVM. In the recent past I've been going through a metamorphosis towards polyglot projects. In my spare time, I enjoy Smart Home and Home Automation.
