feat(macOS): Capture audio on macOS using Tap API #4209
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -56,3 +56,6 @@ package-lock.json | |
| # Python | ||
| *.pyc | ||
| venv/ | ||
|
|
||
|
|
||
| .cache/ | ||
| Original file line number | Diff line number | Diff line change | ||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
|
@@ -825,6 +825,31 @@ editing the `conf` file in a text editor. Use the examples as reference. | |||||||||||||||||||
| </tr> | ||||||||||||||||||||
| </table> | ||||||||||||||||||||
|
|
||||||||||||||||||||
| ### macos_system_wide_audio_tap | ||||||||||||||||||||
|
|
||||||||||||||||||||
| <table> | ||||||||||||||||||||
| <tr> | ||||||||||||||||||||
| <td>Description</td> | ||||||||||||||||||||
| <td colspan="2"> | ||||||||||||||||||||
| @tip{Overrides Audio Sink settings.} | ||||||||||||||||||||
| Toggles the creation of a system-wide audio tap that captures outgoing audio from all processes. | ||||||||||||||||||||
| This tap can act as an input in a HAL aggregate device, like a virtual microphone. | ||||||||||||||||||||
| @note{Requirement: macOS 14.2 or later.} | ||||||||||||||||||||
| @attention{macOS Privacy Settings: The user must add Terminal or Sunshine to <strong>Privacy & Security > Screen & System Audio Recording > System Audio Recording Only</strong> in System Settings.} | ||||||||||||||||||||
| </td> | ||||||||||||||||||||
| </tr> | ||||||||||||||||||||
| <tr> | ||||||||||||||||||||
| <td>Default</td> | ||||||||||||||||||||
| <td colspan="2">disabled</td> | ||||||||||||||||||||
| </tr> | ||||||||||||||||||||
| <tr> | ||||||||||||||||||||
| <td>Example</td> | ||||||||||||||||||||
| <td colspan="2">@code{} | ||||||||||||||||||||
| macos_system_wide_audio_tap = disabled | ||||||||||||||||||||
| @endcode</td> | ||||||||||||||||||||
| </tr> | ||||||||||||||||||||
| </table> | ||||||||||||||||||||
|
|
||||||||||||||||||||
|
||||||||||||||||||||
| Sunshine can only access microphones on macOS due to system limitations. | |
| To stream system audio use "Soundflower" or "BlackHole". |
Sunshine/docs/getting_started.md
Lines 376 to 378 in 705d763
| Sunshine can only access microphones on macOS due to system limitations. To stream system audio use | |
| [Soundflower](https://github.com/mattingalls/Soundflower) or | |
| [BlackHole](https://github.com/ExistentialAudio/BlackHole). |
Sunshine/src_assets/common/assets/web/configs/tabs/AudioVideo.vue
Lines 37 to 40 in 705d763
| <template #macos> | |
| <a href="https://github.com/mattingalls/Soundflower" target="_blank">Soundflower</a><br> | |
| <a href="https://github.com/ExistentialAudio/BlackHole" target="_blank">BlackHole</a>. | |
| </template> |
Maybe we should keep a little blurb somewhere about how to use these two if they don't want to use the Tap API?
I'll remove that.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,29 +1,205 @@ | ||
| /** | ||
| * @file src/platform/macos/av_audio.h | ||
| * @brief Declarations for audio capture on macOS. | ||
| * @brief Declarations for macOS audio capture with dual input paths. | ||
| * | ||
| * This header defines the AVAudio class which provides two distinct audio capture methods: | ||
| * 1. **Microphone capture** - Uses AVFoundation framework to capture from specific microphone devices | ||
| * 2. **System-wide audio tap** - Uses Core Audio taps to capture all system audio output (macOS 14.2+) | ||
| * | ||
| * The system-wide audio tap allows capturing audio from all applications and system sounds, | ||
| * while microphone capture focuses on input from physical or virtual microphone devices. | ||
| */ | ||
| #pragma once | ||
|
|
||
| // platform includes | ||
| #import <AudioToolbox/AudioToolbox.h> | ||
| #import <AVFoundation/AVFoundation.h> | ||
| #import <CoreAudio/AudioHardwareTapping.h> | ||
| #import <CoreAudio/CoreAudio.h> | ||
|
|
||
| // lib includes | ||
| #include "third-party/TPCircularBuffer/TPCircularBuffer.h" | ||
|
|
||
| // Buffer length for audio processing | ||
| #define kBufferLength 4096 | ||
|
|
||
| NS_ASSUME_NONNULL_BEGIN | ||
|
|
||
| // Forward declarations | ||
| @class AVAudio; | ||
| @class CATapDescription; | ||
|
|
||
| /** | ||
| * @brief Data structure for AudioConverter input callback. | ||
| * Contains audio data and metadata needed for format conversion during audio processing. | ||
| */ | ||
| struct AudioConverterInputData { | ||
| float *inputData; ///< Pointer to input audio data | ||
| UInt32 inputFrames; ///< Total number of input frames available | ||
| UInt32 framesProvided; ///< Number of frames already provided to converter | ||
| UInt32 deviceChannels; ///< Number of channels in the device audio | ||
| AVAudio *avAudio; ///< Reference to the AVAudio instance | ||
| }; | ||
|
|
||
| /** | ||
| * @brief IOProc client data structure for Core Audio system taps. | ||
| * Contains configuration and conversion data for real-time audio processing. | ||
| */ | ||
| typedef struct { | ||
| AVAudio *avAudio; ///< Reference to AVAudio instance | ||
| UInt32 clientRequestedChannels; ///< Number of channels requested by client | ||
| UInt32 clientRequestedSampleRate; ///< Sample rate requested by client | ||
| UInt32 clientRequestedFrameSize; ///< Frame size requested by client | ||
| UInt32 aggregateDeviceSampleRate; ///< Sample rate of the aggregate device | ||
| UInt32 aggregateDeviceChannels; ///< Number of channels in aggregate device | ||
| AudioConverterRef _Nullable audioConverter; ///< Audio converter for format conversion | ||
| } AVAudioIOProcData; | ||
|
|
||
| /** | ||
| * @brief Core Audio capture class for macOS audio input and system-wide audio tapping. | ||
| * Provides functionality for both microphone capture via AVFoundation and system-wide | ||
| * audio capture via Core Audio taps (requires macOS 14.2+). | ||
| */ | ||
| @interface AVAudio: NSObject <AVCaptureAudioDataOutputSampleBufferDelegate> { | ||
| @public | ||
| TPCircularBuffer audioSampleBuffer; | ||
| TPCircularBuffer audioSampleBuffer; ///< Shared circular buffer for both audio capture paths | ||
| @private | ||
| // System-wide audio tap components (Core Audio) | ||
| AudioObjectID tapObjectID; ///< Core Audio tap object identifier for system audio capture | ||
| AudioObjectID aggregateDeviceID; ///< Aggregate device ID for system tap audio routing | ||
| AudioDeviceIOProcID ioProcID; ///< IOProc identifier for real-time audio processing | ||
| AVAudioIOProcData *_Nullable ioProcData; ///< Context data for IOProc callbacks and format conversion | ||
| } | ||
|
|
||
| @property (nonatomic, assign) AVCaptureSession *audioCaptureSession; | ||
| @property (nonatomic, assign) AVCaptureConnection *audioConnection; | ||
| @property (nonatomic, assign) NSCondition *samplesArrivedSignal; | ||
| // AVFoundation microphone capture properties | ||
| @property (nonatomic, assign, nullable) AVCaptureSession *audioCaptureSession; ///< AVFoundation capture session for microphone input | ||
| @property (nonatomic, assign, nullable) AVCaptureConnection *audioConnection; ///< Audio connection within the capture session | ||
|
|
||
| + (NSArray *)microphoneNames; | ||
| + (AVCaptureDevice *)findMicrophone:(NSString *)name; | ||
| // Shared synchronization property (used by both audio paths) | ||
| @property (nonatomic, assign, nullable) NSCondition *samplesArrivedSignal; ///< Condition variable to signal when audio samples are available | ||
|
|
||
| - (int)setupMicrophone:(AVCaptureDevice *)device sampleRate:(UInt32)sampleRate frameSize:(UInt32)frameSize channels:(UInt8)channels; | ||
| /** | ||
| * @brief Get all available microphone devices on the system. | ||
| * @return Array of AVCaptureDevice objects representing available microphones | ||
| */ | ||
| + (NSArray<AVCaptureDevice *> *)microphones; | ||
|
|
||
| /** | ||
| * @brief Get names of all available microphone devices. | ||
| * @return Array of NSString objects with microphone device names | ||
| */ | ||
| + (NSArray<NSString *> *)microphoneNames; | ||
|
|
||
| /** | ||
| * @brief Find a specific microphone device by name. | ||
| * @param name The name of the microphone to find (nullable - will return nil if name is nil) | ||
| * @return AVCaptureDevice object if found, nil otherwise | ||
| */ | ||
| + (nullable AVCaptureDevice *)findMicrophone:(nullable NSString *)name; | ||
|
|
||
| /** | ||
| * @brief Sets up microphone capture using AVFoundation framework. | ||
| * @param device The AVCaptureDevice to use for audio input (nullable - will return error if nil) | ||
| * @param sampleRate Target sample rate in Hz | ||
| * @param frameSize Number of frames per buffer | ||
| * @param channels Number of audio channels (1=mono, 2=stereo) | ||
| * @return 0 on success, -1 on failure | ||
| */ | ||
| - (int)setupMicrophone:(nullable AVCaptureDevice *)device sampleRate:(UInt32)sampleRate frameSize:(UInt32)frameSize channels:(UInt8)channels; | ||
|
|
||
| /** | ||
| * @brief Sets up system-wide audio tap for capturing all system audio. | ||
| * Requires macOS 14.2+ and appropriate permissions. | ||
| * @param sampleRate Target sample rate in Hz | ||
| * @param frameSize Number of frames per buffer | ||
| * @param channels Number of audio channels | ||
| * @return 0 on success, -1 on failure | ||
| */ | ||
| - (int)setupSystemTap:(UInt32)sampleRate frameSize:(UInt32)frameSize channels:(UInt8)channels; | ||
|
|
||
| // Buffer management methods for testing and internal use | ||
| /** | ||
| * @brief Initializes the circular audio buffer for the specified number of channels. | ||
| * @param channels Number of audio channels to configure the buffer for | ||
| */ | ||
| - (void)initializeAudioBuffer:(UInt8)channels; | ||
|
|
||
| /** | ||
| * @brief Cleans up and deallocates the audio buffer resources. | ||
| */ | ||
| - (void)cleanupAudioBuffer; | ||
|
|
||
| /** | ||
| * @brief Cleans up system tap resources in a safe, ordered manner. | ||
| * @param tapDescription Optional tap description object to release (can be nil) | ||
| */ | ||
| - (void)cleanupSystemTapContext:(nullable id)tapDescription; | ||
|
|
||
| /** | ||
| * @brief Initializes the system tap context with specified audio parameters. | ||
| * @param sampleRate Target sample rate in Hz | ||
| * @param frameSize Number of frames per buffer | ||
| * @param channels Number of audio channels | ||
| * @return 0 on success, -1 on failure | ||
| */ | ||
| - (int)initializeSystemTapContext:(UInt32)sampleRate frameSize:(UInt32)frameSize channels:(UInt8)channels; | ||
|
|
||
| /** | ||
| * @brief Creates a Core Audio tap description for system audio capture. | ||
| * @param channels Number of audio channels to configure the tap for | ||
| * @return CATapDescription object on success, nil on failure | ||
| */ | ||
| - (nullable CATapDescription *)createSystemTapDescriptionForChannels:(UInt8)channels; | ||
|
|
||
| /** | ||
| * @brief Creates an aggregate device with the specified tap description and audio parameters. | ||
| * @param tapDescription Core Audio tap description for system audio capture | ||
| * @param sampleRate Target sample rate in Hz | ||
| * @param frameSize Number of frames per buffer | ||
| * @return OSStatus indicating success (noErr) or error code | ||
| */ | ||
| - (OSStatus)createAggregateDeviceWithTapDescription:(CATapDescription *)tapDescription sampleRate:(UInt32)sampleRate frameSize:(UInt32)frameSize; | ||
|
|
||
| /** | ||
| * @brief Audio converter complex input callback for format conversion. | ||
| * Handles audio data conversion between different formats during system audio capture. | ||
| * @param inAudioConverter The audio converter reference | ||
| * @param ioNumberDataPackets Number of data packets to convert | ||
| * @param ioData Audio buffer list for converted data | ||
| * @param outDataPacketDescription Packet description for output data | ||
| * @param inputInfo Input data structure containing source audio | ||
| * @return OSStatus indicating success (noErr) or error code | ||
| */ | ||
| - (OSStatus)audioConverterComplexInputProc:(AudioConverterRef)inAudioConverter | ||
| ioNumberDataPackets:(UInt32 *)ioNumberDataPackets | ||
| ioData:(AudioBufferList *)ioData | ||
| outDataPacketDescription:(AudioStreamPacketDescription *_Nullable *_Nullable)outDataPacketDescription | ||
| inputInfo:(struct AudioConverterInputData *)inputInfo; | ||
|
|
||
| /** | ||
| * @brief Core Audio IOProc callback for processing system audio data. | ||
| * Handles real-time audio processing, format conversion, and writes to circular buffer. | ||
| * @param inDevice The audio device identifier | ||
| * @param inNow Current audio time stamp | ||
| * @param inInputData Input audio buffer list from the device | ||
| * @param inInputTime Time stamp for input data | ||
| * @param outOutputData Output audio buffer list (nullable for input-only devices) | ||
| * @param inOutputTime Time stamp for output data | ||
| * @param clientChannels Number of channels requested by client | ||
| * @param clientFrameSize Frame size requested by client | ||
| * @param clientSampleRate Sample rate requested by client | ||
| * @return OSStatus indicating success (noErr) or error code | ||
| */ | ||
| - (OSStatus)systemAudioIOProc:(AudioObjectID)inDevice | ||
| inNow:(const AudioTimeStamp *)inNow | ||
| inInputData:(const AudioBufferList *)inInputData | ||
| inInputTime:(const AudioTimeStamp *)inInputTime | ||
| outOutputData:(nullable AudioBufferList *)outOutputData | ||
| inOutputTime:(const AudioTimeStamp *)inOutputTime | ||
| clientChannels:(UInt32)clientChannels | ||
| clientFrameSize:(UInt32)clientFrameSize | ||
| clientSampleRate:(UInt32)clientSampleRate; | ||
|
|
||
| @end | ||
|
|
||
| NS_ASSUME_NONNULL_END |
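
The `AudioConverterInputData` struct in the header above tracks `inputFrames` and `framesProvided`, which suggests a pull-style input callback: on each call the converter asks for some number of frames, and the callback hands out only what is still unread. The following is a minimal, platform-neutral sketch of that bookkeeping, not the PR's implementation. The names `ConverterInput`, `Chunk`, and `provide_frames` are hypothetical; the real callback works on an `AudioBufferList` and returns an `OSStatus`.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

// Hypothetical, platform-neutral mirror of AudioConverterInputData.
struct ConverterInput {
  const float *input_data;        // interleaved source samples
  std::uint32_t input_frames;     // total frames available in input_data
  std::uint32_t frames_provided;  // frames already handed to the converter
  std::uint32_t device_channels;  // samples per frame
};

// Result of one pull: where the granted samples start and how many frames they span.
struct Chunk {
  const float *data;
  std::uint32_t frames;
};

// Grants at most `requested_frames`, never reading past the end of the input,
// and advances the cursor so the next call resumes where this one stopped.
inline Chunk provide_frames(ConverterInput &in, std::uint32_t requested_frames) {
  const std::uint32_t remaining = in.input_frames - in.frames_provided;
  const std::uint32_t granted = std::min(requested_frames, remaining);
  const float *start =
    in.input_data + static_cast<std::size_t>(in.frames_provided) * in.device_channels;
  in.frames_provided += granted;
  // An empty grant signals that the current input block is exhausted.
  return Chunk {granted ? start : nullptr, granted};
}
```

With 10 stereo frames of input, successive requests for 4, 8, and 4 frames would be granted 4, 6, and 0 frames respectively. Returning an empty chunk rather than reading stale memory is what lets the caller stop the conversion loop cleanly when a device buffer runs out.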
Is there a reason this needs to be a setting? The main point of confusion with users will be the fact that you don't link your capture to an audio device, the way you do on Windows. If we just defaulted to this, is there a downside?
Just FYI, when I was first playing around with the Tap API, I went about it using the version that used the default audio device. I never tried the system audio tap, which I suppose works better. For example, the system tap captures audio even when the default audio device is muted; I think device capture would not capture audio in that case, but I haven't tested it.
I kept it as a setting to avoid breaking existing setups (users relying on BlackHole or similar). This way an update wouldn’t disrupt their workflow. That said, I'm happy to defer to you and the other maintainers on whether it makes sense to just default to the system tap.
Also, I’m not entirely sure `initStereoGlobalTapButExcludeProcesses` is the best long-term choice, since it enforces stereo and might complicate 5.1/7.1 setups. I don’t have a great way to test multichannel properly, so I’d appreciate your input here.
Without looking at the code super closely, I believe it should just use the Tap API if the audio sink and virtual sink are unset. If those are set, then use whatever they are set to. With this approach, existing setups will not change, since users had to manually set BlackHole or whatever.
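
The fallback proposed in the last comment can be sketched in a few lines. This is only an illustration under stated assumptions: `AudioSettings`, `CaptureSource`, and `choose_capture_source` are hypothetical names, an empty string is assumed to mean "unset", and the macOS 14.2 availability check is left out.

```cpp
#include <string>

// Hypothetical view of the relevant settings; the field names are illustrative
// and may not match Sunshine's actual config structure.
struct AudioSettings {
  std::string audio_sink;    // empty is assumed to mean "unset"
  std::string virtual_sink;  // empty is assumed to mean "unset"
};

enum class CaptureSource {
  SystemTap,        // Core Audio system-wide tap
  ConfiguredDevice  // whatever device the user configured, e.g. BlackHole
};

// Default to the tap only when the user has not configured any sink, so
// setups that explicitly name a device keep behaving as before.
// Note: a real implementation would also need to confirm the OS supports taps.
inline CaptureSource choose_capture_source(const AudioSettings &settings) {
  const bool sinks_unset = settings.audio_sink.empty() && settings.virtual_sink.empty();
  return sinks_unset ? CaptureSource::SystemTap : CaptureSource::ConfiguredDevice;
}
```

Under this rule a separate `macos_system_wide_audio_tap` toggle would not be needed to protect existing users, because anyone relying on BlackHole or Soundflower has already set a sink by hand.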