APL screen always renders after speak finishes – is async render with long speech possible?

Summary

The Alexa Presentation Language (APL) is used to build visual experiences for Alexa skills on devices with screens, such as the Echo Show. In the skill described here, the APL screen does not visibly update until the spoken response has finished, so visual feedback arrives late and the screen feels secondary. The delay is most noticeable in storytelling skills, where a single response can contain long speech.

Root Cause

The delay comes from how Alexa orders speech and APL commands within a single response, not from slow rendering. When a response carries both output speech and an ExecuteCommands directive, Alexa plays the speech first and runs the commands afterwards. The contributing factors are:

  • Speech and command ordering: Commands sent with ExecuteCommands are queued behind the response's output speech, so any UI change they make appears only once the speech ends.
  • Empty initial state: In the example below, RenderDocument starts with an empty storyText and with storyVisible and badgeVisible set to 0, so the screen has little to show while the speech plays.
  • Device limitations: Render time varies between devices, but this is likely a secondary factor compared with the ordering above.

Why This Happens in Real Systems

This shows up in production skills because a handler typically returns speech, RenderDocument, and ExecuteCommands together in one response. Any visual change that depends on ExecuteCommands then waits for the speech, and the wait grows with the length of the speech, which is why storytelling skills are hit hardest. The effects include:

  • Delayed visual feedback: The screen updates late, so it feels secondary and less useful.
  • Degraded user experience: Users listen to long speech while the screen shows stale or empty content.
  • Inconsistent user interface: What is on screen does not match what is being spoken until the speech ends.

Real-World Impact

The real-world impact of this issue is significant, particularly in use cases where long speech is involved. The delayed visual feedback can cause:

  • User frustration: Users may become frustrated with the delayed visual feedback, which can cause them to abandon the skill.
  • Limited engagement: The limited user experience can cause users to engage less with the skill, which can lead to a decrease in usage and retention.
  • Negative reviews: Users may leave negative reviews due to the delayed visual feedback, which can harm the skill’s reputation and visibility.

Example or Code

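// Assumption: sessionAttributes comes from the SDK attributes manager.
// `apl` is the skill's own helper module (not part of the ASK SDK).
const sessionAttributes = handlerInput.attributesManager.getSessionAttributes();

if (apl.supportsAPL(handlerInput)) {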
  return handlerInput.responseBuilder
    .speak('Welcome. Say start to begin!')
    .addDirective({
      type: 'Alexa.Presentation.APL.RenderDocument',
      token: apl.APL_TOKEN,
      document: require('../apl/main.json'),
      datasources: {
        payload: {
          background: apl.getBackground('WELCOME'),
          storyText: '',
          badge: '',
          badgeVisible: 0,
          storyVisible: 0
        }
      }
    })
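    // These commands run only after the .speak() output above has finished,
    // so the SetValue below reaches the screen late.
    .addDirective({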
      type: 'Alexa.Presentation.APL.ExecuteCommands',
      token: apl.APL_TOKEN,
      commands: [
        {
          type: 'SetValue',
          componentId: 'welcomeBg',
          property: 'source',
          value: apl.getBackground(sessionAttributes)
        }
      ]
    })
    .reprompt('Say start to continue.')
    .getResponse();
}
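
One thing worth trying with the handler above is to put the final background directly into the RenderDocument datasource instead of patching it in afterwards with SetValue. The following is a minimal sketch, not the author's implementation. It assumes the document binds the source of welcomeBg to ${payload.background} (main.json is not shown, so that binding is a guess), and buildWelcomeDirective, the token, and the URL are hypothetical placeholders.

```javascript
// Hypothetical helper: builds a single RenderDocument directive whose
// datasource already holds the final values, so the first frame needs
// no follow-up ExecuteCommands/SetValue.
function buildWelcomeDirective(document, token, background) {
  return {
    type: 'Alexa.Presentation.APL.RenderDocument',
    token,
    document,
    datasources: {
      payload: {
        // Final value up front; assumed to be bound as ${payload.background}.
        background,
        storyText: '',
        badge: '',
        badgeVisible: 0,
        storyVisible: 0
      }
    }
  };
}

// Example call with placeholder values (not the skill's real document or assets).
const welcomeDirective = buildWelcomeDirective(
  { type: 'APL', version: '1.6', mainTemplate: { parameters: ['payload'], items: [] } },
  'welcomeToken',
  'https://example.com/welcome.png'
);
console.log(welcomeDirective.datasources.payload.background);
```

In the handler, the returned object would be passed to a single addDirective call, and the separate ExecuteCommands directive could be dropped if nothing else depends on it.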

How Senior Engineers Fix It

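Progressive responses do not solve this: the Progressive Response API accepts only the VoicePlayer.Speak directive, so it cannot deliver RenderDocument or ExecuteCommands while speech is playing. Senior engineers instead make the first render complete and move long speech into APL itself. This can be achieved by:

  • Breaking up long speech into smaller chunks: Shorter chunks shorten the wait before the next visual update, and each chunk can be shown on screen as it is spoken.
  • Using ExecuteCommands to drive the speech: Bind the speech to a component with the ssmlToSpeech transformer and start it with a SpeakItem command, leaving the response's output speech short or empty, so the screen is already rendered when speaking begins.
  • Optimizing the APL document: Put final values in the initial datasource instead of patching them in with SetValue, and keep the document small to reduce render time.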
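
The first two points can be sketched together. The code below is an illustration under stated assumptions, not a definitive implementation: chunkStory, escapeSsml, and buildStoryDirectives are hypothetical helpers, and the APL document is assumed to declare a payload parameter and contain a Text component with id storyText whose text is bound to ${payload.storyData.properties.storyText} and whose speech is bound to ${payload.storyData.properties.storySpeech}. The ssmlToSpeech transformer and the SpeakItem command are standard APL features.

```javascript
// Split long text into chunks of at most maxLen characters, breaking on
// sentence boundaries. A single sentence longer than maxLen is kept whole.
function chunkStory(text, maxLen = 300) {
  const sentences = text.match(/[^.!?]+[.!?]+|[^.!?]+$/g) || [];
  const chunks = [];
  let current = '';
  for (const raw of sentences) {
    const sentence = raw.trim();
    if (!sentence) continue;
    const candidate = current ? `${current} ${sentence}` : sentence;
    if (candidate.length > maxLen && current) {
      chunks.push(current);
      current = sentence;
    } else {
      current = candidate;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}

// Escape characters that would otherwise break the SSML markup.
function escapeSsml(text) {
  return text.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
}

// Build the two directives for one chunk: render the text, then speak it
// through APL (SpeakItem) instead of through the response's outputSpeech.
function buildStoryDirectives(document, token, chunk) {
  const render = {
    type: 'Alexa.Presentation.APL.RenderDocument',
    token,
    document,
    datasources: {
      storyData: {
        type: 'object',
        properties: {
          storyText: chunk,
          storySsml: `<speak>${escapeSsml(chunk)}</speak>`
        },
        // Converts storySsml into speech, exposed as properties.storySpeech.
        transformers: [
          { inputPath: 'storySsml', outputName: 'storySpeech', transformer: 'ssmlToSpeech' }
        ]
      }
    }
  };
  const speak = {
    type: 'Alexa.Presentation.APL.ExecuteCommands',
    token,
    // 'storyText' is the assumed id of the Text component in the document.
    commands: [{ type: 'SpeakItem', componentId: 'storyText' }]
  };
  return [render, speak];
}
```

A handler would add both directives to the response and omit .speak() for the story text, for example by speaking the first element of chunkStory(story) and keeping the remaining chunks in session attributes for later turns. How the remaining chunks are advanced (by user turn, Pager, or a command sequence) is a design choice this sketch leaves open.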

Why Juniors Miss It

Juniors may miss this issue because they:

  • Lack experience with APL: Juniors may not know that output speech and ExecuteCommands run in sequence rather than in parallel.
  • Don’t test thoroughly: Short test phrases hide the delay, which only becomes obvious with long speech on a real device.
  • Don’t consider edge cases: Long responses, such as a full story passage, are rarely exercised during development.
