What's New in iOS 9 and OS X 10.11
This chapter summarizes the new features introduced in iOS 9 and OS X 10.11.
Feature Sets
All devices that support Metal conform to a feature set value described in Listing 10-1.
Listing 10-1 Metal Feature Sets
typedef NS_ENUM(NSUInteger, MTLFeatureSet)
{
    MTLFeatureSet_iOS_GPUFamily1_v1 = 0,
    MTLFeatureSet_iOS_GPUFamily2_v1 = 1,
    MTLFeatureSet_iOS_GPUFamily1_v2 = 2,
    MTLFeatureSet_iOS_GPUFamily2_v2 = 3,
    MTLFeatureSet_iOS_GPUFamily3_v1 = 4,
    MTLFeatureSet_OSX_GPUFamily1_v1 = 10000
};
All OS X devices that support Metal support the OSX_GPUFamily1_v1 feature set.
iOS devices that support Metal support a feature set determined by their GPU and OS versions. See the MTLFeatureSet reference and iOS Device Compatibility Reference for more information.
To find out which feature set is supported by a device, query the supportsFeatureSet: method of a MTLDevice object.
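For example, a sketch of choosing a rendering path at startup based on the reported feature set; the branch bodies are placeholders for app-specific configuration:

```objc
#import <Metal/Metal.h>

id <MTLDevice> device = MTLCreateSystemDefaultDevice();

if ([device supportsFeatureSet:MTLFeatureSet_iOS_GPUFamily3_v1]) {
    // Enable features exclusive to this family, such as counting
    // occlusion queries and indirect drawing.
}
else if ([device supportsFeatureSet:MTLFeatureSet_iOS_GPUFamily2_v1]) {
    // Enable the ASTC-compressed texture path.
}
else {
    // Fall back to the baseline iOS_GPUFamily1 capabilities.
}
```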
Device Selection
Use the MTLCreateSystemDefaultDevice function to obtain the preferred GPU for your app. Devices that support the OSX_GPUFamily1_v1 feature set may have multiple GPUs, all of which you can obtain by calling the MTLCopyAllDevices function. To learn more about a single GPU in a multi-GPU system, query the headless property to find out whether the GPU is attached to a display, and query the lowPower property to find the lower-power GPU in an automatic graphics switching system.
You can query specific render and compute characteristics with the new supportsTextureSampleCount: method and maxThreadsPerThreadgroup property added to the MTLDevice protocol.
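As a sketch, an OS X app that prefers the lower-power GPU on an automatic graphics switching system might select a device like this (on iOS, only MTLCreateSystemDefaultDevice is available):

```objc
#import <Metal/Metal.h>

id <MTLDevice> device = MTLCreateSystemDefaultDevice();

// OS X only: inspect every GPU in a multi-GPU system.
NSArray *allDevices = MTLCopyAllDevices();
for (id <MTLDevice> candidate in allDevices) {
    // Prefer a display-attached, lower-power GPU to reduce energy use.
    if (candidate.lowPower && !candidate.headless) {
        device = candidate;
        break;
    }
}
```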
Resource Storage Modes and Device Memory Models
The OSX_GPUFamily1_v1 feature set includes support for managing resources on GPUs with discrete memory. Resource memory allocation is explicitly handled by selecting an appropriate storage mode value from the enum in Listing 10-2 for textures or the enum in Listing 10-3 for buffers.
Listing 10-2 Texture Storage Modes
typedef NS_ENUM(NSUInteger, MTLStorageMode)
{
    MTLStorageModeShared  = 0,
    MTLStorageModeManaged = 1,
    MTLStorageModePrivate = 2,
};
Listing 10-3 Buffer Storage Modes
#define MTLResourceStorageModeShift 4
typedef NS_ENUM(NSUInteger, MTLResourceOptions)
{
    MTLResourceStorageModeShared  = MTLStorageModeShared  << MTLResourceStorageModeShift,
    MTLResourceStorageModeManaged = MTLStorageModeManaged << MTLResourceStorageModeShift,
    MTLResourceStorageModePrivate = MTLStorageModePrivate << MTLResourceStorageModeShift,
};
iOS feature sets support only the shared and private storage modes. The three storage modes are described in the following subsections.
Shared
Resources allocated with the shared storage mode are stored in memory that is accessible to both the CPU and the GPU. On devices with discrete memory, these resources are accessed directly from CPU local memory rather than being copied to GPU memory.
In iOS feature sets, this is the default storage mode for textures. In the OSX_GPUFamily1_v1 feature set, textures cannot be allocated with the shared storage mode.
Private
Resources allocated with the private storage mode are stored in memory that is accessible only to the GPU. The contents property of a private buffer returns NULL, and the following MTLTexture methods must not be called on a private texture:
replaceRegion:mipmapLevel:slice:withBytes:bytesPerRow:bytesPerImage:
getBytes:bytesPerRow:bytesPerImage:fromRegion:mipmapLevel:slice:
To access a private resource, your app may perform one or more of the following actions:
Blit to or from the private resource.
Read from the private resource from any shader function.
Render to the private resource from a fragment shader.
Write to the private resource from a compute function.
Managed
On GPUs without discrete memory, managed resources have only a single memory allocation accessible to both the CPU and GPU. On GPUs with discrete memory, managed resources internally allocate both CPU-accessible and GPU-accessible memory.
Managed textures are not available in iOS feature sets; use MTLStorageModeShared instead.
Any time your app uses the CPU to directly modify the contents of a managed buffer, you must call the didModifyRange: method to notify Metal that the contents within the specified byte range have changed.
If you use the GPU to modify the contents of a managed resource and you wish to access the results with the CPU, you must first synchronize the resource with the synchronizeResource: or synchronizeTexture:slice:level: method of a blit command encoder.
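A minimal sketch of the managed-buffer workflow in both directions; the Uniforms struct, its time field, and the commandBuffer object are placeholders assumed to exist elsewhere in the app:

```objc
// CPU writes: modify the shared copy, then mark the changed byte range.
id <MTLBuffer> buffer = [device newBufferWithLength:sizeof(Uniforms)
                                            options:MTLResourceStorageModeManaged];
Uniforms *contents = (Uniforms *)buffer.contents;
contents->time = 1.0f; // hypothetical field
[buffer didModifyRange:NSMakeRange(0, sizeof(Uniforms))];

// GPU writes: synchronize before reading the results back on the CPU.
id <MTLBlitCommandEncoder> blitEncoder = [commandBuffer blitCommandEncoder];
[blitEncoder synchronizeResource:buffer];
[blitEncoder endEncoding];
// After the command buffer completes, buffer.contents reflects the GPU writes.
```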
Choosing a Resource Storage Mode
Generally, there are four main scenarios to consider when choosing a storage mode:
Choose MTLResourceStorageModePrivate for resources that are only ever read from and/or written to by the GPU, such as render target textures (a particularly common and important case).
Choose MTLResourceStorageModeManaged or MTLResourceStorageModePrivate for resources that are initialized once and used many times in the future.
Choose MTLResourceStorageModeManaged for resources that are populated per frame by the CPU and then read by the GPU, such as shader constant buffers or dynamic vertex data.
Choose MTLResourceStorageModeShared for resources written to by the CPU and then blitted into GPU memory (or for CPU reads from GPU memory), such as staging buffers.
Setting and Querying a Resource Storage Mode
For textures, set the desired storage mode with the storageMode property of a MTLTextureDescriptor object. The default storage mode for textures is MTLStorageModeShared in iOS feature sets and MTLStorageModeManaged in the OSX_GPUFamily1_v1 feature set.
For buffers, set the desired storage mode by passing the respective MTLResourceOptions value into any of the newBufferWithLength:options:, newBufferWithBytes:length:options:, or newBufferWithBytesNoCopy:length:options:deallocator: methods of MTLDevice. The default storage mode for buffers is MTLStorageModeShared, but apps targeting the OSX_GPUFamily1_v1 feature set may see increased performance by explicitly managing their buffers with the managed or private storage modes.
The storage mode of a resource, whether a texture or a buffer, can be queried with the storageMode property of a MTLResource object.
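For instance, a sketch that applies the guidelines above, creating a private render target texture and a shared staging buffer (the sizes and pixel format are arbitrary):

```objc
// A render target that only the GPU touches: choose private storage.
MTLTextureDescriptor *descriptor =
    [MTLTextureDescriptor texture2DDescriptorWithPixelFormat:MTLPixelFormatBGRA8Unorm
                                                       width:1024
                                                      height:1024
                                                   mipmapped:NO];
descriptor.storageMode = MTLStorageModePrivate;
id <MTLTexture> renderTarget = [device newTextureWithDescriptor:descriptor];

// A staging buffer written by the CPU and blitted by the GPU: choose shared storage.
id <MTLBuffer> staging = [device newBufferWithLength:4096
                                             options:MTLResourceStorageModeShared];

// Either resource reports its mode through the MTLResource storageMode property:
// staging.storageMode == MTLStorageModeShared
```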
Textures
Hardware support for different texture types and pixel formats is a key capability difference between feature sets. This section lists the major texture additions to the framework; for a more detailed discussion, see the code listings and comparison tables in the MTLPixelFormat reference and the Metal Feature Set Tables chapter.
Compressed Textures
The iOS_GPUFamily2_v1, iOS_GPUFamily2_v2, and iOS_GPUFamily3_v1 feature sets add support for ASTC textures. These pixel formats are listed in Listing 10-4.
Listing 10-4 ASTC Pixel Formats
typedef NS_ENUM(NSUInteger, MTLPixelFormat)
{
    // ASTC
    MTLPixelFormatASTC_4x4_sRGB   = 186,
    MTLPixelFormatASTC_5x4_sRGB   = 187,
    MTLPixelFormatASTC_5x5_sRGB   = 188,
    MTLPixelFormatASTC_6x5_sRGB   = 189,
    MTLPixelFormatASTC_6x6_sRGB   = 190,
    MTLPixelFormatASTC_8x5_sRGB   = 192,
    MTLPixelFormatASTC_8x6_sRGB   = 193,
    MTLPixelFormatASTC_8x8_sRGB   = 194,
    MTLPixelFormatASTC_10x5_sRGB  = 195,
    MTLPixelFormatASTC_10x6_sRGB  = 196,
    MTLPixelFormatASTC_10x8_sRGB  = 197,
    MTLPixelFormatASTC_10x10_sRGB = 198,
    MTLPixelFormatASTC_12x10_sRGB = 199,
    MTLPixelFormatASTC_12x12_sRGB = 200,
    MTLPixelFormatASTC_4x4_LDR    = 204,
    MTLPixelFormatASTC_5x4_LDR    = 205,
    MTLPixelFormatASTC_5x5_LDR    = 206,
    MTLPixelFormatASTC_6x5_LDR    = 207,
    MTLPixelFormatASTC_6x6_LDR    = 208,
    MTLPixelFormatASTC_8x5_LDR    = 210,
    MTLPixelFormatASTC_8x6_LDR    = 211,
    MTLPixelFormatASTC_8x8_LDR    = 212,
    MTLPixelFormatASTC_10x5_LDR   = 213,
    MTLPixelFormatASTC_10x6_LDR   = 214,
    MTLPixelFormatASTC_10x8_LDR   = 215,
    MTLPixelFormatASTC_10x10_LDR  = 216,
    MTLPixelFormatASTC_12x10_LDR  = 217,
    MTLPixelFormatASTC_12x12_LDR  = 218,
};
The OSX_GPUFamily1_v1 feature set supports BC textures instead. The new pixel formats are listed in Listing 10-5.
Listing 10-5 BC Pixel Formats
typedef NS_ENUM(NSUInteger, MTLPixelFormat)
{
    // BC1, BC2, BC3 (aka S3TC/DXT)
    MTLPixelFormatBC1_RGBA           = 130,
    MTLPixelFormatBC1_RGBA_sRGB      = 131,
    MTLPixelFormatBC2_RGBA           = 132,
    MTLPixelFormatBC2_RGBA_sRGB      = 133,
    MTLPixelFormatBC3_RGBA           = 134,
    MTLPixelFormatBC3_RGBA_sRGB      = 135,
    // BC4, BC5 (aka RGTC)
    MTLPixelFormatBC4_RUnorm         = 140,
    MTLPixelFormatBC4_RSnorm         = 141,
    MTLPixelFormatBC5_RGUnorm        = 142,
    MTLPixelFormatBC5_RGSnorm        = 143,
    // BC6H, BC7 (aka BPTC)
    MTLPixelFormatBC6H_RGBFloat      = 150,
    MTLPixelFormatBC6H_RGBUfloat     = 151,
    MTLPixelFormatBC7_RGBAUnorm      = 152,
    MTLPixelFormatBC7_RGBAUnorm_sRGB = 153,
};
PVRTC Blit Operations
The MTLBlitCommandEncoder protocol contains a new MTLBlitOption value that enables copying to or from a texture with a PVRTC pixel format. Two new methods support these operations by passing the MTLBlitOptionRowLinearPVRTC value into their options parameter:
To copy PVRTC data from a buffer to a texture, use
copyFromBuffer:sourceOffset:sourceBytesPerRow:sourceBytesPerImage:sourceSize:toTexture:destinationSlice:destinationLevel:destinationOrigin:options:
To copy PVRTC data from a texture to a buffer, use
copyFromTexture:sourceSlice:sourceLevel:sourceOrigin:sourceSize:toBuffer:destinationOffset:destinationBytesPerRow:destinationBytesPerImage:options:
PVRTC blocks are arranged linearly in memory in row-major order, similar to all other compressed texture formats. PVRTC pixel formats are only available in iOS feature sets.
Depth/Stencil Render Targets
The OSX_GPUFamily1_v1 feature set does not support separate depth and stencil render targets. If both depth and stencil render targets are needed, use one of the newly introduced combined depth/stencil pixel formats to set the same texture as both the depth and the stencil render target. The combined depth/stencil pixel formats are listed in Listing 10-6.
Listing 10-6 Depth/Stencil Pixel Formats
typedef NS_ENUM(NSUInteger, MTLPixelFormat)
{
    // Depth/Stencil
    MTLPixelFormatDepth24Unorm_Stencil8 = 255,
    MTLPixelFormatDepth32Float_Stencil8 = 260,
};
All feature sets support the MTLPixelFormatDepth32Float_Stencil8 pixel format. Only some devices that support the OSX_GPUFamily1_v1 feature set also support the MTLPixelFormatDepth24Unorm_Stencil8 pixel format. Query the depth24Stencil8PixelFormatSupported property of a MTLDevice object to determine whether the pixel format is supported.
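For example, a sketch that picks the best available combined format and allocates a private depth/stencil render target (the 1024x1024 size is arbitrary):

```objc
MTLPixelFormat depthStencilFormat = MTLPixelFormatDepth32Float_Stencil8;
if (device.depth24Stencil8PixelFormatSupported) {
    depthStencilFormat = MTLPixelFormatDepth24Unorm_Stencil8;
}

MTLTextureDescriptor *descriptor =
    [MTLTextureDescriptor texture2DDescriptorWithPixelFormat:depthStencilFormat
                                                       width:1024
                                                      height:1024
                                                   mipmapped:NO];
// Depth/stencil textures can only use the private storage mode.
descriptor.storageMode = MTLStorageModePrivate;
id <MTLTexture> depthStencilTexture = [device newTextureWithDescriptor:descriptor];
```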
Textures with a depth, stencil, or depth/stencil pixel format can only be allocated with the private storage mode. To load or save the contents of these textures, you must perform a blit operation. The MTLBlitCommandEncoder protocol contains new MTLBlitOption values that define the behavior of a blit operation for textures with a depth, stencil, or depth/stencil pixel format:
Use the MTLBlitOptionNone value to blit the contents of a texture with a depth-only or stencil-only pixel format.
Use the MTLBlitOptionDepthFromDepthStencil value to blit the depth portion of a texture with a combined depth/stencil pixel format.
Use the MTLBlitOptionStencilFromDepthStencil value to blit the stencil portion of a texture with a combined depth/stencil pixel format.
Two new methods support these operations with their options parameter:
To copy depth or stencil data from a buffer to a texture, use
copyFromBuffer:sourceOffset:sourceBytesPerRow:sourceBytesPerImage:sourceSize:toTexture:destinationSlice:destinationLevel:destinationOrigin:options:
To copy depth or stencil data from a texture to a buffer, use
copyFromTexture:sourceSlice:sourceLevel:sourceOrigin:sourceSize:toBuffer:destinationOffset:destinationBytesPerRow:destinationBytesPerImage:options:
Note: To blit the contents of a texture with a depth-only or stencil-only pixel format, you may forgo the methods that take a MTLBlitOption value and instead use the copyFromBuffer:sourceOffset:sourceBytesPerRow:sourceBytesPerImage:sourceSize:toTexture:destinationSlice:destinationLevel:destinationOrigin: and copyFromTexture:sourceSlice:sourceLevel:sourceOrigin:sourceSize:toBuffer:destinationOffset:destinationBytesPerRow:destinationBytesPerImage: methods.
Cube Array Textures
The OSX_GPUFamily1_v1 feature set adds support for cube array textures with the MTLTextureTypeCubeArray texture type value. Similarly, support for this texture type has been added to the Metal shading language with the texturecube_array type. The maximum length of a cube array texture is 341 (2048 divided by 6 cube faces, rounded down).
Texture Usage
The MTLTextureDescriptor class contains the new usage property that allows you to declare how a texture will be used in your app. Multiple MTLTextureUsage values may be combined with a bitwise OR (|) if the texture will serve multiple uses over its lifetime. A MTLTexture object can only be used in the ways specified by its usage values; an error occurs otherwise. The MTLTextureUsage options and their intended uses are described as follows:
MTLTextureUsageShaderRead enables loading or sampling from the texture in any shader stage.
MTLTextureUsageShaderWrite enables writing to the texture from compute shaders.
MTLTextureUsageRenderTarget enables using the texture as a color, depth, or stencil render target in a render pass descriptor.
MTLTextureUsagePixelFormatView indicates that the texture will be used to create a new texture view with the newTextureViewWithPixelFormat: or newTextureViewWithPixelFormat:textureType:levels:slices: method.
Specifying and adhering to an appropriate texture usage allows Metal to optimize GPU operations for a given texture. For example, set the descriptor's usage value to MTLTextureUsageRenderTarget if you intend to use the resulting texture as a render target. On certain hardware, this may significantly improve your app's performance.
If you don't know what a texture will be used for, set the descriptor's usage value to MTLTextureUsageUnknown. This value allows the newly created texture to be used everywhere, but Metal cannot optimize its use in your app.
Important: The default usage value in the OSX_GPUFamily1_v1 feature set is MTLTextureUsageShaderRead. In the iOS feature sets, the default value is MTLTextureUsageUnknown. You should always aim to determine and set specific texture usages; do not rely on MTLTextureUsageUnknown for the best performance.
Listing 10-7 shows you how to create a texture with multiple uses.
Listing 10-7 Specifying a Texture’s Usage
MTLTextureDescriptor* textureDescriptor = [[MTLTextureDescriptor alloc] init];
textureDescriptor.usage = MTLTextureUsageRenderTarget | MTLTextureUsageShaderRead;
// set additional properties
id <MTLTexture> texture = [self.device newTextureWithDescriptor:textureDescriptor];
// use the texture in a color render target
// sample from the texture in a fragment shader
Detailed Texture Views
The MTLTexture protocol adds the extended newTextureViewWithPixelFormat:textureType:levels:slices: method that allows you to specify a new texture type, base level range, and base slice range for a new texture view (in addition to the pixel format parameter already supported by the newTextureViewWithPixelFormat: method). Textures created with these texture view methods can now query their parent texture's attributes with the parentTexture, parentRelativeLevel, and parentRelativeSlice properties (in addition to querying other attributes already supported by the pixelFormat and textureType properties). For details on texture view creation restrictions, such as valid casting targets, see MTLTexture Protocol Reference.
Note: Similar to querying parent texture attributes, a new texture created from a buffer with the newTextureWithDescriptor:offset:bytesPerRow: method can now query its source buffer attributes with the buffer, bufferOffset, and bufferBytesPerRow properties.
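As an illustration, a sketch that views one mipmap level of an sRGB texture through its linear pixel format; it assumes parentTexture was created with the MTLTextureUsagePixelFormatView usage:

```objc
id <MTLTexture> view =
    [parentTexture newTextureViewWithPixelFormat:MTLPixelFormatRGBA8Unorm
                                     textureType:MTLTextureType2D
                                          levels:NSMakeRange(2, 1)
                                          slices:NSMakeRange(0, 1)];

// The view can report where it came from:
// view.parentTexture       == parentTexture
// view.parentRelativeLevel == 2
// view.parentRelativeSlice == 0
```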
IOSurface Support
The OSX_GPUFamily1_v1 feature set adds support for IOSurfaces. Use the newTextureWithDescriptor:iosurface:plane: method to create a new texture from an existing IOSurface.
Render Additions
Render Command Encoder
There are several graphics API additions to the MTLRenderCommandEncoder protocol. The main new features are summarized below:
- The setStencilFrontReferenceValue:backReferenceValue: method allows front-facing and back-facing primitives to use different stencil test reference values.
- Depth clipping is supported with the MTLDepthClipMode enum and the setDepthClipMode: method.
- Counting occlusion queries are supported in the OSX_GPUFamily1_v1 and iOS_GPUFamily3_v1 feature sets with the new MTLVisibilityResultModeCounting value, which can be passed into the setVisibilityResultMode:offset: method.
- Texture barriers are supported in the OSX_GPUFamily1_v1 feature set by calling the textureBarrier method between write and read operations on the same texture.
- Base vertex and base instance values are supported in the OSX_GPUFamily1_v1 and iOS_GPUFamily3_v1 feature sets with the drawPrimitives:vertexStart:vertexCount:instanceCount:baseInstance: and drawIndexedPrimitives:indexCount:indexType:indexBuffer:indexBufferOffset:instanceCount:baseVertex:baseInstance: methods. Similarly, support for these drawing inputs has been added to the Metal shading language with the new [[ base_vertex ]] and [[ base_instance ]] vertex shader inputs.
- Indirect drawing is supported in the OSX_GPUFamily1_v1 and iOS_GPUFamily3_v1 feature sets with the argument structures listed in Listing 10-8. Use these structs alongside the drawPrimitives:indirectBuffer:indirectBufferOffset: and drawIndexedPrimitives:indexType:indexBuffer:indexBufferOffset:indirectBuffer:indirectBufferOffset: methods, respectively.
Listing 10-8 Indirect Drawing Argument Structures
// Without an index list
typedef struct {
    uint32_t vertexCount;
    uint32_t instanceCount;
    uint32_t vertexStart;
    uint32_t baseInstance;
} MTLDrawPrimitivesIndirectArguments;

// With an index list
typedef struct {
    uint32_t indexCount;
    uint32_t instanceCount;
    uint32_t indexStart;
    int32_t  baseVertex;
    uint32_t baseInstance;
} MTLDrawIndexedPrimitivesIndirectArguments;
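For example, a sketch of a non-indexed indirect draw; here the arguments are filled by the CPU, but a compute shader could just as well write them on the GPU (the counts are arbitrary, and renderEncoder is assumed to exist):

```objc
MTLDrawPrimitivesIndirectArguments arguments = {
    .vertexCount   = 36,
    .instanceCount = 16,
    .vertexStart   = 0,
    .baseInstance  = 0,
};
id <MTLBuffer> indirectBuffer =
    [device newBufferWithBytes:&arguments
                        length:sizeof(arguments)
                       options:MTLResourceStorageModeShared];

// The draw call reads its arguments from the buffer at execution time.
[renderEncoder drawPrimitives:MTLPrimitiveTypeTriangle
               indirectBuffer:indirectBuffer
         indirectBufferOffset:0];
```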
- Shader constant updates can now be performed more efficiently by setting a vertex buffer once and then simply updating its offset inside your draw loop, as shown in Listing 10-9. If your app has a very small amount of constant data (tens of bytes), you can instead use the setVertexBytes:length:atIndex: method so the Metal framework can manage the constant buffer for you.
Listing 10-9 Shader Constant Updates
id <MTLBuffer> constant_buffer = /* initialize buffer */;
MyConstants* constant_ptr = constant_buffer.contents;
[renderpass setVertexBuffer:constant_buffer offset:0 atIndex:0];
for (i = 0; i < draw_count; i++)
{
    constant_ptr[i] = /* write constants directly into the buffer */;
    [renderpass setVertexBufferOffset:i*sizeof(MyConstants) atIndex:0];
    // draw
}
Layered Rendering
The OSX_GPUFamily1_v1 feature set adds support for layered rendering with new APIs in the MTLRenderPassDescriptor and MTLRenderPipelineDescriptor classes. Layered rendering enables a vertex shader to render each primitive to a layer of a texture array, cube texture, or 3D texture, with the target layer specified by the primitive's first vertex. For a 2D texture array or a cube texture, each slice is a layer; for a 3D texture, each depth plane of pixels is a layer. Load and store actions apply to every layer of the render target.
To enable layered rendering, you must configure your render pass and render pipeline descriptors appropriately:
Set the value of the renderTargetArrayLength property to specify the minimum number of layers available across all render targets. For example, set this value to 6 if you have a 2D texture array and a cube texture, each with a minimum of 6 layers.
Set the value of the inputPrimitiveTopology property to specify the primitive type being rendered. For example, set this value to MTLPrimitiveTopologyClassTriangle for cube-based shadow mapping. The full enum declaration containing all primitive type values is listed in Listing 10-10.
Additionally, the value of sampleCount must be 1 (this is the default value).
Listing 10-10 Primitive Topology Values
typedef NS_ENUM(NSUInteger, MTLPrimitiveTopologyClass)
{
    MTLPrimitiveTopologyClassUnspecified = 0,
    MTLPrimitiveTopologyClassPoint       = 1,
    MTLPrimitiveTopologyClassLine        = 2,
    MTLPrimitiveTopologyClassTriangle    = 3
};
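A sketch of the descriptor configuration for cube-based shadow mapping; cubeShadowMap is an assumed cube texture with six layers:

```objc
// Render pass: attach the cube texture and declare its six layers.
MTLRenderPassDescriptor *passDescriptor = [MTLRenderPassDescriptor renderPassDescriptor];
passDescriptor.colorAttachments[0].texture = cubeShadowMap;
passDescriptor.renderTargetArrayLength = 6;

// Render pipeline: declare the primitive topology up front.
MTLRenderPipelineDescriptor *pipelineDescriptor = [[MTLRenderPipelineDescriptor alloc] init];
pipelineDescriptor.inputPrimitiveTopology = MTLPrimitiveTopologyClassTriangle;
pipelineDescriptor.sampleCount = 1; // required for layered rendering
// set vertex and fragment functions, attachment pixel formats, and so on
```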
Compute Additions
There are also a few compute API additions to both existing and new classes, as summarized below:
- Indirect processing is supported in the OSX_GPUFamily1_v1 and iOS_GPUFamily3_v1 feature sets with the argument structure listed in Listing 10-11. Use this struct alongside the dispatchThreadgroupsWithIndirectBuffer:indirectBufferOffset:threadsPerThreadgroup: method.
Listing 10-11 Indirect Processing Argument Structure
typedef struct {
    uint32_t threadgroupsPerGrid[3];
} MTLDispatchThreadgroupsIndirectArguments;
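A sketch of an indirect dispatch; as with indirect drawing, the threadgroup counts could be written by an earlier compute pass instead of the CPU (the counts and computeEncoder are assumptions):

```objc
MTLDispatchThreadgroupsIndirectArguments arguments = {
    .threadgroupsPerGrid = { 64, 64, 1 }
};
id <MTLBuffer> indirectBuffer =
    [device newBufferWithBytes:&arguments
                        length:sizeof(arguments)
                       options:MTLResourceStorageModeShared];

[computeEncoder dispatchThreadgroupsWithIndirectBuffer:indirectBuffer
                                  indirectBufferOffset:0
                                 threadsPerThreadgroup:MTLSizeMake(8, 8, 1)];
```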
- The new MTLComputePipelineDescriptor class specifies the compute configuration state used during a compute pass. This descriptor object is used to create a MTLComputePipelineState object. This new class is the compute counterpart to the previously introduced MTLRenderPipelineDescriptor class.
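A sketch of creating a pipeline state through the new descriptor; the library object and the kernel name are placeholders:

```objc
MTLComputePipelineDescriptor *descriptor = [[MTLComputePipelineDescriptor alloc] init];
descriptor.label = @"Image Filter Pipeline";
descriptor.computeFunction = [library newFunctionWithName:@"my_kernel"]; // assumed kernel

NSError *error = nil;
id <MTLComputePipelineState> pipelineState =
    [device newComputePipelineStateWithDescriptor:descriptor
                                          options:MTLPipelineOptionNone
                                       reflection:nil
                                            error:&error];
```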
Supporting Frameworks
Metal provides two new frameworks that make it easier to build powerful Metal apps.
MetalKit
The MetalKit framework provides a set of utility functions and classes that reduce the effort required to create a Metal app. MetalKit provides development support for three key areas:
Texture loading helps your app easily and asynchronously load textures from a variety of sources. Common file formats such as PNG and JPEG are supported, as well as texture-specific formats such as KTX and PVR.
Model handling provides Metal-specific functionality that makes it easy to interface with Model I/O assets. Use these highly-optimized functions and objects to transfer data efficiently between Model I/O meshes and Metal buffers.
View management provides a standard implementation of a Metal view that drastically reduces the amount of code needed to create a graphics-rendering app.
The MetalKit framework is available in all Metal feature sets. To learn more about the MetalKit APIs, see MetalKit Framework Reference.
Metal Performance Shaders
The Metal Performance Shaders framework provides highly-optimized compute and graphics shaders that are designed to integrate easily and efficiently into your Metal app.
Use the Metal Performance Shaders classes to achieve optimal performance on all supported devices, without having to target or update your shader code for specific GPU families. Metal Performance Shaders objects fit seamlessly into your Metal app and can be used with resource objects such as buffers and textures.
Common shaders provided by the Metal Performance Shaders framework include:
Gaussian blur.
Image histogram.
Sobel edge detection.
The Metal Performance Shaders framework is available in the iOS_GPUFamily2_v1, iOS_GPUFamily2_v2, and iOS_GPUFamily3_v1 feature sets. To learn more about the Metal Performance Shaders APIs, see Metal Performance Shaders Framework Reference.