Nouveau TODO tasks

Trello board

The Nouveau team now has a Trello board with all the current tasks. Hopefully it contains more immediate tasks than this list does, and newcomers will find it more enjoyable!

DRM tasks

  • [Junior, added 19.3.2012] Provide patches to allow Nouveau to compile on old kernels. The desired goal would be to support the latest stable kernel released. The ultimate goal here would be to successfully compile Nouveau on linux-lts (3.0) and on the latest Ubuntu and Fedora kernels. - Martin Peres
  • [Junior, added 26.4.2011] Print a single debug message if KMS is disabled. Currently, if KMS is disabled via drm or nouveau, Nouveau will successfully load and silently do nothing, which leads to confusion. - PekkaPaalanen
  • [Junior, updated 23.2.2010] Profiling: Use the kernel tracing infrastructure (latency tracer, irqsoff tracer, http://www.latencytop.org/ , etc.) to find out whether Nouveau causes large latencies or disables interrupts for a long time. This might be a nice entry-level task for getting to know the DRM code.
  • [Portability] Write a portable version of pci_read_config_dword and use it in nouveau_mem.c.
  • Write a test for register endianness. Some BIOSes (notably the Mac's Open Firmware) set up the NVIDIA regs as big endian. Ideally, we should read a known reg, see if it is backwards, and act accordingly.
  • Fix the page size mapping problem on G5/64-bit for exposing FIFO regs.
  • Look for files with missing license information and fix them, for example nv20_graph.c.
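
The register endianness test in the list above could be sketched roughly as follows. This is only an illustration in plain user-space C, not Nouveau code: the function names, the `expected` value and the `mask` are all hypothetical, and which register has a suitable known value is exactly what the task would need to establish.

```c
#include <stdint.h>

/* Swap the four bytes of a 32-bit value. */
static uint32_t reg_swab32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

enum reg_endian {
    REG_ENDIAN_NATIVE,   /* raw reads already match the expected value */
    REG_ENDIAN_SWAPPED,  /* raw reads match only after a byte swap */
    REG_ENDIAN_UNKNOWN   /* ambiguous or unexpected contents */
};

/*
 * Classify a raw read of a register whose contents are known in advance.
 * 'expected' and 'mask' are assumptions for this sketch: 'mask' selects
 * the bits that are stable enough to compare.
 */
static enum reg_endian detect_reg_endian(uint32_t raw, uint32_t expected,
                                         uint32_t mask)
{
    int native  = (raw & mask) == (expected & mask);
    int swapped = (reg_swab32(raw) & mask) == (expected & mask);

    if (native && !swapped)
        return REG_ENDIAN_NATIVE;
    if (swapped && !native)
        return REG_ENDIAN_SWAPPED;
    /* Byte-symmetric values match both ways, so they prove nothing. */
    return REG_ENDIAN_UNKNOWN;
}

/* Apply the detected byte order to any later raw register read. */
static uint32_t reg_fixup(uint32_t raw, enum reg_endian e)
{
    return e == REG_ENDIAN_SWAPPED ? reg_swab32(raw) : raw;
}
```

A byte-symmetric reference value such as 0x11111111 cannot distinguish the two orders, which is why the sketch reports it as unknown rather than guessing.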

PM tasks

Energy consumption is something that should be taken care of in various places. Here is a list of what needs to be researched (power consumption-wise) and addressed as needed:

  • DVFS
  • PCIE ASPM
  • Disable engines on the fly when they are not needed (including xfer)
  • Automatic declock (hw-based)
  • PCIE lane dynamic adjustment
  • DAC declocking
  • Automatic vsync (something like, when FPS > 60, sync to vblank automatically)

When we have a clear view of the advantages and drawbacks of each technique, along with how fast we can change its behaviour, we will need to decide dynamically whether to activate or deactivate each feature. That logic should be based on:

  • energy source (mains vs battery)
  • PCOUNTER (knowing all the available signals would be great :) We also need to map them for all chipsets :( )
  • temperature
  • battery charge (that could be fun)

Finally, we will need to introduce different profiles so that users can express their needs:

  • Benchmark: Everything optimized for performance, but save power when the user obviously isn't using the computer
  • Normal: Should be pretty close to Benchmark, except that it should try to save power even under full use where the user wouldn't notice the difference
  • Economy: Try to be conservative and upclock/activate features only when clearly needed
  • User: Let users write their own "PM scheduler" and plug it into Nouveau (may not be possible)
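
To make the interaction between the inputs and the profiles above more concrete, here is a minimal sketch of a policy function. Everything in it is an assumption made for illustration: the type and function names, the three performance levels, and all numeric thresholds (95 degrees, 20% battery, the per-profile load thresholds). None of it comes from Nouveau.

```c
#include <stdbool.h>

enum pm_profile { PM_BENCHMARK, PM_NORMAL, PM_ECONOMY };
enum pm_level   { PM_LEVEL_LOW, PM_LEVEL_MID, PM_LEVEL_HIGH };

/* Hypothetical snapshot of the inputs listed above. */
struct pm_inputs {
    bool on_battery;   /* energy source: false means mains */
    int  load_pct;     /* 0..100, e.g. derived from PCOUNTER signals */
    int  temp_c;       /* GPU temperature in degrees Celsius */
    int  battery_pct;  /* remaining battery charge, 0..100 */
};

/* Pick a performance level; all thresholds are made-up placeholders. */
static enum pm_level pm_pick_level(enum pm_profile profile,
                                   const struct pm_inputs *in)
{
    enum pm_level ceiling = PM_LEVEL_HIGH;
    enum pm_level level;
    int up_threshold;

    /* Thermal safety overrides every profile. */
    if (in->temp_c >= 95)
        return PM_LEVEL_LOW;

    switch (profile) {
    case PM_BENCHMARK: up_threshold = 5;  break; /* drop only when idle */
    case PM_NORMAL:    up_threshold = 40; break;
    default:           up_threshold = 80; break; /* PM_ECONOMY */
    }

    /* On a nearly empty battery, cap everything except Benchmark. */
    if (in->on_battery && in->battery_pct < 20 && profile != PM_BENCHMARK)
        ceiling = PM_LEVEL_MID;

    if (in->load_pct >= up_threshold)
        level = PM_LEVEL_HIGH;
    else if (in->load_pct >= up_threshold / 2)
        level = PM_LEVEL_MID;
    else
        level = PM_LEVEL_LOW;

    return level > ceiling ? ceiling : level;
}
```

A real policy would also need hysteresis so that the level does not flap when the load hovers around a threshold. That is left out here to keep the sketch short.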

Then, we need to know what can be offloaded to PDAEMON and what needs to run on the HOST. The more that is done on PDAEMON the better, since reacting from the hardware side is much faster.

Video decoding

  • Decode two frames at once with VP2 (similar to the VP3+ impl) to hopefully improve a/v sync jitter in high-motion videos
  • Get field_pic_flag=1 H.264 videos to decode properly on VP2. There's some reference frame mismanagement (maybe) going on.
  • On VP2, MPEG2 over vdpau shows artifacts, but xvmc works fine. Figure out what the issue is. (Probably related to skipped macroblocks somehow... but how?)
  • On VP2, MPEG1 over vdpau shows blocking, as if the IDCT is done with incorrect parameters. The blob shows the same artifacts. XvMC looks perfect. Probably an error in the oddification/something else.
  • Add VDPAU MPEG2 support to VPE2 cards ([NV31:NV84]). Can look at how VP2 does the bitstream parsing bits/IQ. It's not quite that simple though, since doing the naive hookup doesn't work (skipped mb's, probably other issues... XvMC presents much cleaner data).

DDX tasks

  • Re-add hw video overlay support using the (newly added in 3.13) drm planes for [NV04:NV40].
  • [Junior, added 26.4.2011] Review and improve returned error codes in libdrm_nouveau, so that the DDX can say something more specific than just "error creating device" or "error opening device". Could say e.g. "incompatible kernel driver version". - PekkaPaalanen
  • [Junior] Cleanup the remaining obfuscated parts of the code. In particular, reinstate register names instead of hardcoded constants, also try to reinsert relevant comments or unobfuscated parts from http://cvsweb.xfree86.org/cvsweb/xc/programs/Xserver/hw/xfree86/vga256/drivers/nv/?hideattic=0&only_with_tag=xf-3_3_3
  • Optimize core EXA to generate fewer requests that nv10/20/30 can't handle. Namely, repeats of 1*N or N*1 surfaces can have their "1" direction replaced with a stretch and their N direction replaced with a loop. These are heavily used to draw window borders.
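
The stretch-plus-loop idea from the EXA item above can be illustrated with a small planning function. This is a sketch under stated assumptions and not EXA code: it assumes the source is one pixel wide and N pixels tall, and that the hardware can do a stretched blit. `struct rect` and `plan_column_repeat` are hypothetical names.

```c
#include <stddef.h>

struct rect { int x, y, w, h; };

/*
 * Plan the blits needed to fill a dst_w x dst_h area with a repeating
 * source that is 1 pixel wide and n pixels tall.
 *
 * Horizontally, repeating a single column is the same as stretching it
 * to dst_w, so no repeat is needed in that direction. Vertically, we
 * loop in steps of n and emit one stretched blit per step.
 *
 * Writes destination rectangles to 'out' (at most 'max' of them) and
 * returns how many were written. The last rectangle may be shorter than
 * n, in which case only the first 'h' source rows are used.
 */
static size_t plan_column_repeat(int n, int dst_w, int dst_h,
                                 struct rect *out, size_t max)
{
    size_t count = 0;

    if (n <= 0 || dst_w <= 0)
        return 0;

    for (int y = 0; y < dst_h && count < max; y += n) {
        int h = (dst_h - y < n) ? dst_h - y : n;

        out[count].x = 0;
        out[count].y = y;
        out[count].w = dst_w;
        out[count].h = h;
        count++;
    }
    return count;
}
```

For a 1x4 source and a 100x10 destination this yields three stretched blits (rows 0-3, 4-7 and 8-9) instead of a hardware repeat. The N*1 case is the same with the axes exchanged.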

DRI / Mesa tasks

  • Improve the shader code generator (add an instruction scheduling pass or other optimizations)
  • Pick an OpenGL application/game and make it run faster (or more reliably)
  • Look at a piglit test failure and try to fix it. See piglit test results.

General tasks

Reverse-engineering tasks

  • [Junior] Figure out how depth handling is different with a 16bit versus 24bit depth buffer. In particular, if depth clear values have to be changed and so on...
  • [Junior] Figure out how color buffer handling is different with 16-bit versus 32-bit colors.
  • Figure out the commands emitted on NV10, NV20, NV30 and NV40 by the following enables: GL_AUTO_NORMAL, GL_COLOR_MATERIAL, GL_COLOR_SUM_EXT, GL_CONVOLUTION_1D, GL_INDEX_LOGIC_OP, GL_LINE_SMOOTH, GL_LINE_STIPPLE, GL_MAP1_COLOR_4, GL_MAP1_INDEX, GL_MAP1_NORMAL, GL_MAP1_TEXTURE_COORD_1, GL_MAP1_TEXTURE_COORD_2, GL_MAP1_TEXTURE_COORD_3, GL_MAP1_TEXTURE_COORD_4, GL_MAP1_VERTEX_3, GL_MAP1_VERTEX_4, GL_MAP2_COLOR_4, GL_MAP2_INDEX, GL_MAP2_NORMAL, GL_MAP2_TEXTURE_COORD_1, GL_MAP2_TEXTURE_COORD_2, GL_MAP2_TEXTURE_COORD_3, GL_MAP2_TEXTURE_COORD_4, GL_MAP2_VERTEX_3, GL_MAP2_VERTEX_4. Refer to GL. Also figure out whether those enables are useful to any real OpenGL application.
  • Unify the NV10_TCL, NV20_TCL and NV30_TCL offset names. Right now there are offsets for similar commands that bear different names.
  • [Junior with OpenGL knowledge] Write new tests for unexplored extensions.
  • [Junior] Merge the Xbox commands found in inner.txt into the NV20_TCL_PRIMITIVE object

Other items

  • [Junior] Write a VRAM memtest tool. Map MMIO and use the PRAMIN (or PMEM) aperture to access the non-mappable VRAM, too. Make it a standalone tool that does not use DRM and that restores the memory contents. It should be executed with the GPU halted, i.e. in VGA text mode or vesafb.
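
The core loop of such a memtest tool might look like the sketch below. It only shows the save, pattern-test and restore steps on a window of 32-bit words. The `win` pointer stands in for an already mapped aperture: how to map MMIO and slide the PRAMIN (or PMEM) window over VRAM is not shown and is not assumed here. The function name and the pattern set are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Test 'words' 32-bit words starting at 'win' and restore the original
 * contents afterwards. 'win' is assumed to point at an already mapped
 * memory window.
 *
 * Returns the number of words that failed at least one pattern.
 */
static size_t memtest_window(volatile uint32_t *win, size_t words)
{
    static const uint32_t patterns[] = {
        0x00000000u, 0xffffffffu, 0xaaaaaaaau, 0x55555555u
    };
    size_t bad = 0;

    for (size_t i = 0; i < words; i++) {
        uint32_t saved = win[i];   /* remember the original contents */
        int failed = 0;

        for (size_t p = 0; p < sizeof(patterns) / sizeof(patterns[0]); p++) {
            win[i] = patterns[p];
            if (win[i] != patterns[p])
                failed = 1;
        }

        win[i] = saved;            /* restore, as the task requires */
        if (failed)
            bad++;
    }
    return bad;
}
```

Writing and reading back one word at a time catches stuck bits but not addressing faults. A fuller tool would also write address-dependent patterns across the whole window before verifying.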