Andrzej Janik
044fab47e5
Update devcontainer ( #507 )
2025-09-14 19:03:14 -07:00
Violet
e3a99b7ee1
Fix min.ftz.nan.f16 for ROCm 6.3.4 ( #506 )
This PR fixes compatibility issues with ROCm 6.3.4 by replacing the unavailable llvm.minimum.f16 intrinsic with an equivalent implementation using llvm.minnum.f16 combined with NaN handling logic. The same applies to max operations.
This also adds the `"amdgpu-ieee"="false"` attribute everywhere, which gives us better codegen for min/max.
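A minimal sketch of the NaN-handling idea described above, assuming plain Rust `f32` stands in for f16 and that `f32::min`/`f32::max` play the role of `llvm.minnum`/`llvm.maxnum` (each returns the non-NaN operand when exactly one input is NaN). The function names are hypothetical, and the sketch models neither `.ftz` nor signed-zero ordering; it is not the LLVM IR that ZLUDA emits.

```rust
/// NaN-propagating minimum built on top of a NaN-ignoring minimum.
/// Hypothetical helper for illustration only.
fn min_propagating_nan(a: f32, b: f32) -> f32 {
    // `f32::min` ignores a single NaN operand, similar to `llvm.minnum`.
    let m = a.min(b);
    // Restore NaN propagation: any NaN input makes the result NaN.
    if a.is_nan() || b.is_nan() {
        f32::NAN
    } else {
        m
    }
}

/// The same construction for maximum.
fn max_propagating_nan(a: f32, b: f32) -> f32 {
    let m = a.max(b);
    if a.is_nan() || b.is_nan() {
        f32::NAN
    } else {
        m
    }
}
```

With ordinary inputs both helpers behave like `min`/`max`; with a NaN on either side they return NaN instead of the other operand.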
2025-09-12 16:48:47 -07:00
Andrzej Janik
6c811a55d2
Random fixes ( #504 )
This is a collection of random changes coming from the workload I'm working on. The most important change is better support for `.param`: PTX uses the .param namespace both for some local variables and for kernel args. This is a problem for us because those are different address spaces on AMDGPU. So far we've made an effort to convert to local and const namespaces whenever possible, but this commit tries to handle more patterns, which are impossible to track precisely, by converting to generic space.
2025-09-12 13:52:33 -07:00
Violet
7b5fdb30c4
Implement ldmatrix ( #503 )
2025-09-09 19:31:56 -07:00
Violet
3da39364e0
Make blame ignore formatting commit ( #502 )
2025-09-09 13:12:31 -07:00
Violet
d81456a549
Add support for cvt_rn_bf16x2_f32 ( #501 )
2025-09-08 17:41:24 -07:00
Violet
d342e1a06e
Implement redux.sync for u32 and s32 ( #500 )
2025-09-08 16:13:28 -07:00
Andrzej Janik
869d291099
Progress compilation despite parsing errors ( #495 )
Previously, if we ran into a broken instruction we'd fail the whole compilation. This PR changes it so that (only in Release mode) we try to progress at all costs: if we have trouble parsing an instruction, we just remove the function from the output and continue.
For some workloads we can still compile a semi-broken but meaningful subset of a module.
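A rough sketch of the "drop the broken function and continue" behavior, under stated assumptions: `ParsedFunction`, `parse_instruction` and `parse_module` are hypothetical names, the instruction parser is a toy, and the `best_effort` flag stands in for the Release-mode switch mentioned above. It illustrates the policy only, not ZLUDA's actual parser.

```rust
/// A successfully parsed function (hypothetical, simplified type).
#[derive(Debug, PartialEq)]
struct ParsedFunction {
    name: String,
    instructions: Vec<String>,
}

/// Toy instruction parser: accepts any non-empty line that ends in ';'.
fn parse_instruction(line: &str) -> Result<String, String> {
    let trimmed = line.trim();
    if !trimmed.is_empty() && trimmed.ends_with(';') {
        Ok(trimmed.to_string())
    } else {
        Err(format!("cannot parse instruction: {trimmed:?}"))
    }
}

/// Parses every function in a module.
/// With `best_effort == false` the first error aborts the whole module;
/// with `best_effort == true` a function that fails to parse is left out
/// of the output and parsing continues with the next function.
fn parse_module(
    functions: &[(&str, &[&str])],
    best_effort: bool,
) -> Result<Vec<ParsedFunction>, String> {
    let mut output = Vec::new();
    for (name, lines) in functions {
        let parsed: Result<Vec<String>, String> =
            lines.iter().map(|line| parse_instruction(line)).collect();
        match parsed {
            Ok(instructions) => output.push(ParsedFunction {
                name: name.to_string(),
                instructions,
            }),
            // Best-effort mode: skip the broken function entirely.
            Err(_) if best_effort => continue,
            // Strict mode: propagate the first error.
            Err(error) => return Err(error),
        }
    }
    Ok(output)
}
```

Passing the mode as a parameter keeps both behaviors testable in one build; a real implementation could instead select it at compile time.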
2025-09-08 14:35:29 -07:00
Violet
00ca92167d
Remove accidentally committed file ( #499 )
2025-09-08 13:18:11 -07:00
Violet
4306646739
Support immediates in vector operands ( #488 )
2025-09-08 10:26:58 -07:00
Violet
b11c17a48b
Fix const_ident.ll ( #497 )
A minor fix to the .ptx was not updated in the .ll.
2025-09-08 09:49:59 -07:00
Violet
e7f10afb51
Use `Vec<RegOrImmediate>` as const/global variable initializer ( #490 )
2025-09-05 16:41:41 -07:00
Violet
b7f3a647d7
Implement `fma.rn.fn.bf16x2` ( #496 )
* Add fma bf16x2 test
* Implement fma.rn.fn.bf16x2
* cargo fmt
2025-09-04 17:29:20 -07:00
Andrzej Janik
5309065cc1
Enhance bug report template with ZLUDA version input ( #494 )
Added version input field to bug report template with a note on version support.
2025-09-03 16:20:34 -07:00
Violet
ac54d9a8cc
Add issue form for zluda_trace logs ( #432 )
2025-09-03 14:34:08 -07:00
Andrzej Janik
6dd633d32a
Update quick start link in README.md ( #493 )
2025-09-03 14:18:28 -07:00
Andrzej Janik
a34a8da53f
Rework the documentation and landing page ( #474 )
2025-09-03 14:11:46 -07:00
Andrzej Janik
88b01c809e
Add small compiler fixes and a fake ptxas binary ( #491 )
2025-09-03 12:23:01 -07:00
Andrzej Janik
8a7a5b45be
Reorganize driver host tests, fix bugs around pointer host code ( #492 )
2025-09-03 12:22:07 -07:00
Violet
08f7e874e3
Update broken tests ( #489 )
It seems like #484 broke the LLVM IR tests; this PR updates the .ll files to match what's now being produced.
2025-09-02 16:14:54 -07:00
aiwhskruht
4752fcdcf2
Api traits test code ( #487 )
Add initial templated API test support. This needs to be improved to use an attribute macro, but that will require some major surgery :(
2025-09-02 08:57:54 -07:00
Andrzej Janik
9d4f1699d0
Do a better job in zluda_trace when saving opaque ELF binaries ( #486 )
2025-08-28 18:23:25 -07:00
Andrzej Janik
ea99dcc0b0
Implement vote instruction and add support for %laneid ( #484 )
2025-08-28 18:23:09 -07:00
Violet
8f484d6a5f
Add support for fp8 to `cvt` ( #468 )
This implements specifically the fp8 conversion instructions needed by llm.c:
* `cvt.rn.satfinite{.relu}.f8x2type.f32`
* `cvt.rn{.relu}.f16x2.f8x2type`
It uses HIP's fp8 and fp16 headers: https://rocm.docs.amd.com/projects/HIP/en/docs-develop/reference/low_fp_types.html#fp8-quarter-precision .
2025-08-28 17:54:07 -07:00
Andrzej Janik
3632f2bf03
Some fixes to BLASLt ( #482 )
2025-08-26 14:28:36 -07:00
Andrzej Janik
ec1358af1c
Add more NVML and cuBLAS coverage ( #481 )
2025-08-26 12:55:12 -07:00
Andrzej Janik
62d340e4bd
Add more host-side functionality ( #480 )
2025-08-25 16:55:17 -07:00
Violet
de319f7c00
Add test for conversion from .f16x2 to .b32 ( #479 )
2025-08-25 15:33:53 -07:00
Violet
00eb553454
Add pass test mechanism for insert_implicit_conversions ( #477 )
2025-08-22 13:01:39 -07:00
Andrzej Janik
e805cb72a5
Add nvml tracing ( #476 )
2025-08-18 23:09:15 -07:00
Andrzej Janik
547536de4a
Add more instructions, tighten generated assembly ( #475 )
2025-08-18 11:12:57 -07:00
Andrzej Janik
65367f04ee
Fix how full-precision fp32 sqrt and div are handled ( #467 )
Previously, when compiling full-precision `sqrt`/`div`, we'd leave it to LLVM. LLVM looks at the module's `denormal-fp-math-f32` mode, which is incompatible with how we handle denormals and could give wrong results in certain edge cases.
Instead, handle it fully inside ZLUDA.
2025-08-14 17:24:40 -07:00
Violet
a420601128
Add test for unrecognized statement error with vector braces ( #472 )
The old code using `take_till_inclusive` assumed that a right brace would be the end of a block and therefore never part of a statement. However, some PTX statements can include vector operands. This meant that any unrecognized statement with a vector operand would backtrack and eventually produce an unhelpful context error rather than an `UnrecognizedStatement` error.
This pull request also adds a mechanism for testing parser errors.
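One way to picture the brace problem is a scanner that tracks nesting depth, so a `}` that closes a vector operand is not mistaken for the end of a block. This is only a sketch under assumptions: `statement_span` is a hypothetical helper written in plain Rust, not the parser-combinator code the PR touches, and it may differ from the actual fix.

```rust
/// Returns the text of the first statement in `input`, up to and including
/// its terminating ';'. Braces that open and close inside the statement
/// (vector operands such as `{%r1, %r2}`) are treated as part of it.
/// Returns `None` if an unmatched '}' (end of the enclosing block) or the
/// end of input is reached before a terminating ';'.
fn statement_span(input: &str) -> Option<&str> {
    let mut depth = 0usize;
    for (idx, ch) in input.char_indices() {
        match ch {
            // Opening brace of a vector operand.
            '{' => depth += 1,
            // Closing brace that matches an earlier '{' in this statement.
            '}' if depth > 0 => depth -= 1,
            // Unmatched '}' closes the enclosing block: no statement here.
            '}' => return None,
            // A ';' outside any braces terminates the statement.
            ';' if depth == 0 => return Some(&input[..=idx]),
            _ => {}
        }
    }
    None
}
```

With a span like this in hand, a parser could report the whole unrecognized statement, vector operand included, instead of stopping at the first right brace.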
2025-08-13 17:23:51 -07:00
Joëlle van Essen
fe7a18f912
zoc (ZLUDA offline compiler) ( #344 )
2025-08-13 15:27:02 -07:00
Andrzej Janik
28eca3d75a
Implement kernel cache ( #465 )
2025-08-11 09:20:40 -07:00
Violet
d2f92e4267
More tracing for custom parsers ( #471 )
2025-08-08 12:19:27 -07:00
Violet
ef98c1e0ba
Add tracing to custom parsers ( #469 )
2025-08-07 11:51:56 -07:00
Violet
94eec34bdb
More descriptive syntax errors ( #466 )
* More descriptive syntax errors
* cargo fmt
* Remove brackets
2025-08-05 17:23:55 -07:00
Andrzej Janik
a1b7600718
Fix version in nightly trigger ( #464 )
2025-08-04 13:09:37 -07:00
Violet
dd05752fc4
Implement `shf` instruction ( #463 )
* Implement `shf` instruction
* Tests for `shf`
* cargo fmt
2025-08-04 10:15:46 -07:00
Andrzej Janik
ce9c0aac23
In nightly runs, make sure cargo is in $PATH ( #462 )
2025-08-01 09:36:11 -07:00
Andrzej Janik
c00496b92a
Install curl correctly ( #461 )
2025-07-31 19:56:37 -07:00
Andrzej Janik
7fc6caffb6
Update nightly runner's ROCm version to 6.3.4 ( #460 )
2025-07-31 19:46:49 -07:00
Andrzej Janik
52d3ea624c
Fix nightly run ( #459 )
2025-07-31 19:20:39 -07:00
Andrzej Janik
704a94e6f4
Fix nightly tests trigger ( #458 )
2025-07-31 18:31:54 -07:00
Andrzej Janik
cd7e2f8e36
Force loading ZLUDA through LD_PRELOAD ( #447 )
Certain applications (pytorch) decide that it's a great idea to distribute the whole CUDA driver and link to it with DT_RPATH. This ignores LD_LIBRARY_PATH.
This code defeats that evil mechanism through any means necessary.
2025-07-31 18:00:13 -07:00
Violet
96ae27e9e1
Implement cublas functions needed for llm.c ( #457 )
2025-07-31 11:08:53 -07:00
Violet
99c36092be
Use `FromCuda` in `zluda_blas` ( #455 )
2025-07-31 09:52:10 -07:00
Andrzej Janik
49aabffdcc
Rename zluda_dump to zluda_trace ( #456 )
* Rename zluda_dump to zluda_trace
* Minor naming fixes
2025-07-31 08:07:03 -07:00
Violet
4d163a4d9b
Implement `cuModuleGetGlobal_v2` ( #454 )
2025-07-30 16:34:21 -07:00