Microsoft introduces a WinUI 3-based audio editor app that uses local AI models to show developers how to integrate AI into their own software.
Microsoft has released a sample project that shows software developers how to integrate local AI models into Windows applications. The app, called Audio Editor, is built on WinUI 3 and the Windows App SDK and demonstrates on-device AI processing.
The main feature of the app is smart trimming of audio files. Users can upload an audio file, specify a topic or keyword, and set the desired length of the trimmed clip. The app then generates a clip containing the section of the recording most relevant to that topic.
To enable this functionality, the app uses three different ONNX models:
- Silero Voice Activity Detection (VAD): This model detects speech and divides the audio file into smaller segments that the other models can process.
- Whisper Tiny: A speech recognition model that transcribes audio segments into text.
- MiniLM: A text embedding model that calculates the semantic similarity between the topic and the transcribed text segments.
Used in sequence, these models allow the app to identify and extract the most relevant audio section.
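The final selection step can be illustrated with a minimal Python sketch. It is not taken from Microsoft's sample. It assumes the audio has already been segmented and transcribed, and that each segment and the topic already have an embedding vector. The names `Segment`, `cosine_similarity`, and `best_window` are hypothetical, and summing similarities over a contiguous run of segments is only one plausible way to pick the clip.

```python
import math
from dataclasses import dataclass


@dataclass
class Segment:
    """One speech segment (hypothetical structure, not from the sample)."""
    start: float            # start time in seconds
    end: float              # end time in seconds
    embedding: list[float]  # would come from a text embedding model


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_window(segments, topic_embedding, max_duration):
    """Return (start, end) of the contiguous run of segments that fits in
    max_duration seconds and has the highest summed similarity to the topic.
    Returns None if no single segment fits."""
    scores = [cosine_similarity(s.embedding, topic_embedding) for s in segments]
    best, best_score = None, -math.inf
    for i in range(len(segments)):
        total = 0.0
        for j in range(i, len(segments)):
            # Stop extending once the run would exceed the requested length.
            if segments[j].end - segments[i].start > max_duration:
                break
            total += scores[j]
            if total > best_score:
                best_score = total
                best = (segments[i].start, segments[j].end)
    return best


# Toy data: two-dimensional vectors stand in for real embeddings.
segments = [
    Segment(0.0, 5.0, [1.0, 0.0]),
    Segment(5.0, 10.0, [0.0, 1.0]),
    Segment(10.0, 15.0, [0.1, 0.9]),
    Segment(15.0, 20.0, [1.0, 0.0]),
]
topic = [0.0, 1.0]
clip = best_window(segments, topic, max_duration=10.0)
print(clip)
```

With this toy data, the two middle segments point in nearly the same direction as the topic vector, so the selected clip spans them. A real pipeline would replace the hand-written vectors with model output and cut the audio at the returned timestamps.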
Microsoft provides the application’s full source code on GitHub, along with detailed setup instructions and a code walkthrough. This should help developers better understand the implementation and integrate similar functionality into their own applications.
Source: www.com-magazin.de