This is a collection of random changes coming from the workload I'm working on. The most important change is better support for `.params`: PTX uses .param namespace both for some local variables and kernel args. This is a problem for us because those are different address spaces on AMDGPU. So far we've made an effort to convert to local and const namespaces whenever possible, but this commit tries to handle more patterns, which are impossible to track precisely, by converting to generic space.
Previously if we ran into a broken instruction we'd fail whole compilation. This PR changes it so (only in Release mode) we try and progress at all cost. Meaning that if we had trouble parsing an instruction we just remove function form the output and continue.
For some workloads we can still compile a semi-broken, but meaningful subset of a module