GL_ARB_fragment_program_shadow support (easy).
It looks like the hardware isn't handling DepthMode like it's supposed to given the texture formats available, which would mean you need to put swizzling into the fragment shader to do the job. This bug is tested by the piglit tests for the extension.
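As a rough illustration of the swizzle the fragment shader would have to apply, here is a minimal C sketch of the mapping from the shadow comparison result to the final texel for each depth texture mode. The enum, struct, and function names are hypothetical and not taken from the driver; only the three mappings themselves come from the GL depth texture mode definition.

```c
/* Hypothetical names; one entry per GL_DEPTH_TEXTURE_MODE setting. */
enum depth_mode {
    DEPTH_MODE_LUMINANCE,
    DEPTH_MODE_INTENSITY,
    DEPTH_MODE_ALPHA
};

struct vec4 {
    float x, y, z, w;
};

/* Expand the scalar comparison result r into the RGBA value for the mode. */
static struct vec4 apply_depth_mode(enum depth_mode mode, float r)
{
    switch (mode) {
    case DEPTH_MODE_INTENSITY:
        return (struct vec4){ r, r, r, r };
    case DEPTH_MODE_ALPHA:
        return (struct vec4){ 0.0f, 0.0f, 0.0f, r };
    case DEPTH_MODE_LUMINANCE:
    default:
        return (struct vec4){ r, r, r, 1.0f };
    }
}
```

In the driver the same mapping would be emitted as a swizzle on the texture sample result rather than computed on the CPU.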
Enable Y tiling for textures (easy).
It should improve performance in general. It just needs a piglit run.
GL_ARB_fragment_coord_conventions (moderate).
Wine likes this extension since it gives them basically DirectX modes for gl_FragCoord. Since gl_FragCoord is generated by our software tnl, it's definitely supportable.
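The two layout qualifiers the extension adds amount to a small coordinate adjustment. The sketch below assumes the input is in the default GL convention (lower-left origin, pixel centers at half-integers); the struct and function names are made up for illustration and do not exist in tnl/.

```c
struct frag_coord {
    float x, y;
};

/*
 * Convert a window position from the default GL convention to the one
 * requested by origin_upper_left / pixel_center_integer.
 */
static struct frag_coord
convert_frag_coord(float x, float y, float fb_height,
                   int origin_upper_left, int pixel_center_integer)
{
    struct frag_coord fc = { x, y };

    if (origin_upper_left)
        fc.y = fb_height - fc.y;   /* flip so y grows downward */

    if (pixel_center_integer) {
        fc.x -= 0.5f;              /* move centers from n + 0.5 to n */
        fc.y -= 0.5f;
    }
    return fc;
}
```

With both qualifiers set, the bottom-left pixel of a 4-pixel-high framebuffer, (0.5, 0.5) by default, becomes (0, 3), which matches the DirectX-style layout.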
Debug failing compressed texture modes (moderate).
piglit fbo-generatemipmap-formats tests for compressed textures have some bugs, and some of them go away on the second render. I'm not sure what's going on here.
Debug draw-elements-base-vertex failure (hard).
The tnl/ code has bugs in this area, and it would be great if they were fixed.
Add GL_ARB_draw_instanced support (moderate).
This is all software code to be written under tnl/.
Add additional vertex formats (easy).
These are all done in software code in tnl/, and there are piglit tests for them. The extensions that could be added are: GL_ARB_half_float_vertex, GL_ARB_vertex_type_2_10_10_10_rev, and GL_EXT_vertex_array_bgra.
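To give a sense of how small the fetch code for these formats is, here is a hedged C sketch of two of the conversions: unpacking a normalized GL_UNSIGNED_INT_2_10_10_10_REV value and reordering a BGRA ubyte color. The function names are hypothetical; how they would plug into the tnl/ vertex fetch path is not shown and is left as an assumption.

```c
#include <stdint.h>

/* Unpack a normalized unsigned 2_10_10_10_REV value: x is in the low bits. */
static void unpack_uint_2_10_10_10_rev(uint32_t packed, float out[4])
{
    out[0] = (float)(packed & 0x3ff) / 1023.0f;
    out[1] = (float)((packed >> 10) & 0x3ff) / 1023.0f;
    out[2] = (float)((packed >> 20) & 0x3ff) / 1023.0f;
    out[3] = (float)((packed >> 30) & 0x3) / 3.0f;
}

/* Convert a BGRA-ordered ubyte color (B, G, R, A in memory) to RGBA floats. */
static void unpack_bgra_ubyte(const uint8_t in[4], float out[4])
{
    out[0] = (float)in[2] / 255.0f;   /* red   */
    out[1] = (float)in[1] / 255.0f;   /* green */
    out[2] = (float)in[0] / 255.0f;   /* blue  */
    out[3] = (float)in[3] / 255.0f;   /* alpha */
}
```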
Add GL_ARB_instanced_arrays support (moderate).
This is all software code to be written under tnl/.
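The core of the software path is choosing which array element to fetch for each attribute. A minimal sketch, assuming a per-attribute divisor as defined by the extension (0 means the attribute advances per vertex); the function name is hypothetical:

```c
/*
 * Pick the array element to fetch for one attribute.
 * divisor == 0: ordinary per-vertex attribute, indexed by the vertex.
 * divisor  > 0: the attribute advances once every `divisor` instances.
 */
static unsigned instanced_element(unsigned vertex, unsigned instance,
                                  unsigned divisor)
{
    if (divisor == 0)
        return vertex;
    return instance / divisor;
}
```

A software implementation would then loop over the instances, run the usual per-vertex pipeline inside that loop, and use this index when fetching each enabled array.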
Add GL_NV_primitive_restart support (easy).
There's a helper function in vbo/ that breaks primitive restart calls into a series of draw calls for you.
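For readers unfamiliar with the technique, the splitting amounts to cutting the index list at every restart index. The sketch below is not the vbo/ helper; it is a standalone illustration with made-up names that returns the sub-ranges a driver would then draw one by one.

```c
#include <stddef.h>

struct draw_range {
    size_t start;   /* first index of the sub-draw */
    size_t count;   /* number of indices in the sub-draw */
};

/*
 * Split an index list at each occurrence of restart_index.
 * Empty runs are skipped. Returns the number of ranges written to out,
 * never more than max_out.
 */
static size_t split_on_restart(const unsigned *indices, size_t n,
                               unsigned restart_index,
                               struct draw_range *out, size_t max_out)
{
    size_t num = 0;
    size_t start = 0;

    for (size_t i = 0; i <= n; i++) {
        if (i == n || indices[i] == restart_index) {
            if (i > start && num < max_out) {
                out[num].start = start;
                out[num].count = i - start;
                num++;
            }
            start = i + 1;   /* next run begins after the restart index */
        }
    }
    return num;
}
```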
Support HW's early depth test correctly (moderate).
You have to make sure you get an MI_FLUSH with render cache flush in between enabling/disabling CLASSIC_EARLY_DEPTH, and only enable it when depth writes are enabled, the depth buffer is tiled, and the color buffer format is not COLORBUFFER_8BIT.
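The conditions above can be written as a single predicate plus a toggle check. This is only a sketch: the struct, its fields, and both function names are hypothetical, and the actual MI_FLUSH emission is left to the caller.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the state the decision depends on. */
struct depth_state {
    bool depth_writes_enabled;
    bool depth_buffer_tiled;
    bool color_is_8bit;          /* color buffer format is COLORBUFFER_8BIT */
};

/* Early depth is only allowed when all three conditions hold. */
static bool want_early_depth(const struct depth_state *s)
{
    return s->depth_writes_enabled &&
           s->depth_buffer_tiled &&
           !s->color_is_8bit;
}

/*
 * Update the cached enable bit. Returns true when the bit changed, in
 * which case the caller must emit a flush with render cache flush
 * before the new setting takes effect.
 */
static bool update_early_depth(bool *enabled, const struct depth_state *s)
{
    bool want = want_early_depth(s);

    if (want == *enabled)
        return false;
    *enabled = want;
    return true;
}
```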
Fix gl_ClipVertex (hard).
See piglit glsl-1.20/execution/clipping.
Fix smooth/flat interpolation (moderate).
See piglit glsl-1.10/execution/interpolation.
Avoid no-op updates of non-pipelined state (moderate).
Calling DrawBuffer with the buffer that is already selected is painful because it flags the draw buffer for an update, and that is non-pipelined state. This brings the meta clear code from 200MB/s to 6MB/s. Something like what 965 does would be better for state tracking in this driver.
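The simplest form of the fix is to compare against the cached value before setting the dirty flag. A minimal sketch, with a hypothetical state struct and function name; it does not reflect how the 965 driver tracks state:

```c
#include <stdbool.h>

/* Hypothetical cached copy of the non-pipelined draw buffer state. */
struct draw_state {
    unsigned draw_buffer;
    bool draw_buffer_dirty;
};

/* Only flag the state dirty when the draw buffer really changes. */
static void set_draw_buffer(struct draw_state *st, unsigned buffer)
{
    if (st->draw_buffer == buffer)
        return;                  /* no-op call: skip the expensive update */

    st->draw_buffer = buffer;
    st->draw_buffer_dirty = true;
}
```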