Commit graph

494 commits

Author SHA1 Message Date
Andrzej Janik
ed1ea1f6de Trace magic pointer 2025-09-24 00:15:25 +00:00
Andrzej Janik
44823a2d75 Set modes 2025-09-23 21:46:22 +00:00
Andrzej Janik
0efbbd002e Write out dev pointers 2025-09-23 01:28:26 +00:00
Andrzej Janik
fd8cce1a7b Merge commit 'a66fddc0fa' into demo_mode2 2025-09-23 01:28:01 +00:00
Andrzej Janik
a66fddc0fa Fail linking on undefined 2025-09-23 01:27:53 +00:00
Andrzej Janik
0c4e103f8f Merge commit '07acc64d33' into demo_mode2 2025-09-22 21:18:18 +00:00
Andrzej Janik
07acc64d33 Allow skipping post-values 2025-09-22 21:18:01 +00:00
Andrzej Janik
3bad9852a5 Minor compiler improvements 2025-09-22 20:29:22 +00:00
Andrzej Janik
6cd976e06a Merge commit '18a2b765f7' into demo_mode2 2025-09-22 18:13:05 +00:00
Andrzej Janik
ed4275c0cb Merge commit 'f46b756fdc' into demo_mode2 2025-09-22 18:01:01 +00:00
Andrzej Janik
f46b756fdc
Fix cuCtxPopCurrent (#519)
2025-09-20 22:45:46 -07:00
Andrzej Janik
18a2b765f7 Pass correct arguments 2025-09-20 01:54:52 +00:00
Andrzej Janik
47703e6507 Pass correct arguments 2025-09-20 01:54:40 +00:00
Andrzej Janik
d35e7b29d6 Merge commit '2b9c8946ec' into demo_mode2 2025-09-20 00:44:33 +00:00
Andrzej Janik
2b9c8946ec Add replayer 2025-09-20 00:43:29 +00:00
Andrzej Janik
644a22fd43 Merge commit '160048a293' into trace_debugger 2025-09-19 23:33:11 +00:00
Andrzej Janik
160048a293 Fix cuCtxPopCurrent 2025-09-19 23:30:29 +00:00
Violet
875ac13be2
Support lists of variables to be declared (#516)
For example,

```
.reg .u32 a, b;
```
2025-09-19 13:36:48 -07:00
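The declaration-list syntax from #516 can be illustrated with a minimal sketch. This is not ZLUDA's parser: the `Variable` struct and `parse_declaration` function are hypothetical names, and the sketch only handles the simple `.space .type name, name;` shape shown in the commit message.

```rust
/// Hypothetical representation of one declared PTX variable.
#[derive(Debug, PartialEq)]
struct Variable {
    space: String,
    ty: String,
    name: String,
}

/// Expands a declaration such as `.reg .u32 a, b;` into one entry per name.
/// Returns `None` if the text does not have the expected shape.
fn parse_declaration(decl: &str) -> Option<Vec<Variable>> {
    let decl = decl.trim().strip_suffix(';')?;
    let mut parts = decl.split_whitespace();
    let space = parts.next()?.strip_prefix('.')?;
    let ty = parts.next()?.strip_prefix('.')?;
    // Everything after the type is a comma-separated list of names.
    let names = parts.collect::<Vec<_>>().join(" ");
    let vars: Vec<Variable> = names
        .split(',')
        .map(str::trim)
        .filter(|name| !name.is_empty())
        .map(|name| Variable {
            space: space.to_string(),
            ty: ty.to_string(),
            name: name.to_string(),
        })
        .collect();
    if vars.is_empty() {
        None
    } else {
        Some(vars)
    }
}
```

Calling `parse_declaration(".reg .u32 a, b;")` yields two `Variable` values, one for `a` and one for `b`, both in the `reg` space with type `u32`.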
Andrzej Janik
f71cd95f71 demo mode 2025-09-19 18:17:08 +00:00
Violet
62ec652e7c
Disable virtual memory management (#515)
We don't currently support it, so report it as unsupported.
2025-09-18 19:11:30 -07:00
Andrzej Janik
3289d92f59 Filter enqueues to be saved 2025-09-19 02:07:59 +00:00
Andrzej Janik
f3e143d8dd Save source ptx and save to the right path 2025-09-19 01:53:01 +00:00
Andrzej Janik
d880ee78b5 Fix some bugs 2025-09-19 00:58:42 +00:00
Andrzej Janik
bfef3317dc Start working on trace replay 2025-09-19 00:39:27 +00:00
Andrzej Janik
b5f41c7cd0
More runtime fixes, add mma instruction (#509)
2025-09-18 11:15:22 -07:00
Violet
150ce171cf
Fix devcontainer (#514)
2025-09-17 14:38:42 -07:00
Violet
571dad0972
Add support for cuBLASLt functions used by llm.c (#512) 2025-09-17 11:02:21 -07:00
Violet
5185138596
Create bindings for hipblasLt (#510)
Generate bindings for hipblasLt and make some changes to the bindings for cublasLt. Notably, the `hip_type` `Option` is changed to a `Vec`, so that multiple `From` implementations (for `rocblas_error` and `hipblasLtError`) can be created for `cublasError_t`.
2025-09-16 16:23:15 -07:00
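The point in #510 about several `From` implementations targeting one error type can be sketched as follows. All enum names, variants, and mappings here are hypothetical stand-ins, not the generated ZLUDA bindings; the sketch only shows why having one `From` impl per backend error lets `?` convert either error into the same CUDA-side error.

```rust
/// Hypothetical stand-in for the cuBLAS-side error type.
#[derive(Debug, PartialEq)]
enum CublasError {
    NotInitialized,
    InvalidValue,
}

/// Hypothetical stand-ins for the two backend error types.
#[allow(dead_code)]
#[derive(Debug)]
enum RocblasError {
    InvalidHandle,
    InvalidValue,
}

#[allow(dead_code)]
#[derive(Debug)]
enum HipblasLtError {
    NotInitialized,
    InvalidValue,
}

// One `From` impl per backend error; the mappings are illustrative only.
impl From<RocblasError> for CublasError {
    fn from(e: RocblasError) -> Self {
        match e {
            RocblasError::InvalidHandle => CublasError::NotInitialized,
            RocblasError::InvalidValue => CublasError::InvalidValue,
        }
    }
}

impl From<HipblasLtError> for CublasError {
    fn from(e: HipblasLtError) -> Self {
        match e {
            HipblasLtError::NotInitialized => CublasError::NotInitialized,
            HipblasLtError::InvalidValue => CublasError::InvalidValue,
        }
    }
}

// Fake backend calls that fail on demand.
fn call_rocblas(ok: bool) -> Result<(), RocblasError> {
    if ok { Ok(()) } else { Err(RocblasError::InvalidHandle) }
}

fn call_hipblaslt(ok: bool) -> Result<(), HipblasLtError> {
    if ok { Ok(()) } else { Err(HipblasLtError::InvalidValue) }
}

/// `?` picks the matching `From` impl for each backend error.
fn gemm(rocblas_ok: bool, hipblaslt_ok: bool) -> Result<(), CublasError> {
    call_rocblas(rocblas_ok)?;
    call_hipblaslt(hipblaslt_ok)?;
    Ok(())
}
```

With a single `From` impl, only one of the two `?` operators in `gemm` would compile; with both, the function can mix calls into either library.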
Violet
3afb8d39e7
Implement cuStreamCreate (#511) 2025-09-16 15:52:35 -07:00
Andrzej Janik
262c25c76e
Use LD_AUDIT instead of LD_PRELOAD (#508)
LD_AUDIT gives us more control than LD_PRELOAD, and I've observed it to work much better.
2025-09-15 11:58:45 -07:00
Andrzej Janik
044fab47e5
Update devcontainer (#507)
2025-09-14 19:03:14 -07:00
Violet
e3a99b7ee1
Fix min.ftz.nan.f16 for ROCm 6.3.4 (#506)
This PR fixes compatibility issues with ROCm 6.3.4 by replacing the unavailable llvm.minimum.f16 intrinsic with an equivalent implementation using llvm.minnum.f16 combined with NaN handling logic. The same applies to max operations.

This also adds the `"amdgpu-ieee"="false"` attribute everywhere, which gives us better codegen for min/max.
2025-09-12 16:48:47 -07:00
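The idea behind #506, building a NaN-propagating minimum out of a NaN-ignoring one, can be sketched in plain Rust. This is an illustration of the technique, not the LLVM IR that ZLUDA emits: it uses `f32` because stable Rust has no `f16` type, the function names are hypothetical, and it does not address signed-zero ordering. Rust's `f32::min` and `f32::max` stand in for the NaN-ignoring operation, since they return the non-NaN operand when exactly one input is NaN.

```rust
/// NaN-propagating minimum: if either input is NaN the result is NaN,
/// otherwise defer to the NaN-ignoring `f32::min`.
fn minimum_nan(a: f32, b: f32) -> f32 {
    if a.is_nan() || b.is_nan() {
        f32::NAN
    } else {
        a.min(b)
    }
}

/// Same construction for the maximum.
fn maximum_nan(a: f32, b: f32) -> f32 {
    if a.is_nan() || b.is_nan() {
        f32::NAN
    } else {
        a.max(b)
    }
}
```

For comparison, `1.0f32.min(f32::NAN)` returns `1.0`, while `minimum_nan(1.0, f32::NAN)` returns NaN.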
Andrzej Janik
6c811a55d2
Random fixes (#504)
This is a collection of random changes coming from the workload I'm working on. The most important change is better support for `.param`: PTX uses the `.param` namespace both for some local variables and for kernel args. This is a problem for us because those are different address spaces on AMDGPU. So far we've made an effort to convert to local and const namespaces whenever possible, but this commit tries to handle more patterns, which are impossible to track precisely, by converting to generic space.
2025-09-12 13:52:33 -07:00
Violet
7b5fdb30c4
Implement ldmatrix (#503)
2025-09-09 19:31:56 -07:00
Violet
3da39364e0
Make blame ignore formatting commit (#502) 2025-09-09 13:12:31 -07:00
Violet
d81456a549
Add support for cvt_rn_bf16x2_f32 (#501)
2025-09-08 17:41:24 -07:00
Violet
d342e1a06e
Implement redux.sync for u32 and s32 (#500) 2025-09-08 16:13:28 -07:00
Andrzej Janik
869d291099
Progress compilation despite parsing errors (#495)
Previously, if we ran into a broken instruction we'd fail the whole compilation. This PR changes it so that (only in Release mode) we try to make progress at all costs, meaning that if we had trouble parsing an instruction we just remove the function from the output and continue.

For some workloads we can still compile a semi-broken, but meaningful, subset of a module.
2025-09-08 14:35:29 -07:00
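The best-effort behaviour described in #495 can be sketched as below. This is a toy model, not ZLUDA's compiler: `Function`, `parse_instruction`, and `compile_module` are hypothetical, the "parser" only recognises three opcodes, and the `best_effort` flag is an assumed stand-in for the Release-mode check mentioned in the commit message.

```rust
/// Hypothetical compiled function: a name plus its parsed instructions.
#[derive(Debug, PartialEq)]
struct Function {
    name: String,
    instructions: Vec<String>,
}

/// Toy parser that accepts only a handful of opcodes.
fn parse_instruction(text: &str) -> Result<String, String> {
    let opcode = text.split_whitespace().next().unwrap_or("");
    match opcode {
        "add" | "mov" | "ret" => Ok(text.to_string()),
        _ => Err(format!("unknown instruction: {}", text)),
    }
}

/// Compiles every function in the module. With `best_effort` set, a function
/// containing an unparseable instruction is dropped from the output;
/// otherwise the first parse error fails the whole module.
fn compile_module(
    functions: &[(&str, &[&str])],
    best_effort: bool,
) -> Result<Vec<Function>, String> {
    let mut output = Vec::new();
    for (name, body) in functions {
        let parsed: Result<Vec<String>, String> =
            body.iter().map(|inst| parse_instruction(inst)).collect();
        match parsed {
            Ok(instructions) => output.push(Function {
                name: name.to_string(),
                instructions,
            }),
            Err(_) if best_effort => continue, // skip the broken function
            Err(e) => return Err(e),
        }
    }
    Ok(output)
}
```

Given a module with one valid function and one containing an unknown opcode, `compile_module(&module, true)` returns only the valid function, while `compile_module(&module, false)` returns the parse error.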
Violet
00ca92167d
Remove accidentally committed file (#499) 2025-09-08 13:18:11 -07:00
Violet
4306646739
Support immediates in vector operands (#488)
2025-09-08 10:26:58 -07:00
Violet
b11c17a48b
Fix const_ident.ll (#497)
A minor fix to the .ptx file was not reflected in the .ll file.
2025-09-08 09:49:59 -07:00
Violet
e7f10afb51
Use Vec<RegOrImmediate> as const/global variable initializer (#490)
2025-09-05 16:41:41 -07:00
Violet
b7f3a647d7
Implement fma.rn.fn.bf16x2 (#496)
* Add fma bf16x2 test

* Implement fma.rn.fn.bf16x2

* cargo fmt
2025-09-04 17:29:20 -07:00
Andrzej Janik
5309065cc1
Enhance bug report template with ZLUDA version input (#494)
Added version input field to bug report template with a note on version support.
2025-09-03 16:20:34 -07:00
Violet
ac54d9a8cc
Add issue form for zluda_trace logs (#432) 2025-09-03 14:34:08 -07:00
Andrzej Janik
6dd633d32a
Update quick start link in README.md (#493) 2025-09-03 14:18:28 -07:00
Andrzej Janik
a34a8da53f
Rework the documentation and landing page (#474) 2025-09-03 14:11:46 -07:00
Andrzej Janik
88b01c809e
Add small compiler fixes and a fake ptxas binary (#491) 2025-09-03 12:23:01 -07:00
Andrzej Janik
8a7a5b45be
Reorganize driver host tests, fix bugs around pointer host code (#492) 2025-09-03 12:22:07 -07:00
Violet
08f7e874e3
Update broken tests (#489)
It seems like #484 broke the LLVM IR tests; this PR updates the .ll files to match what's now being produced.
2025-09-02 16:14:54 -07:00