Violet
08f7e874e3
Update broken tests ( #489 )
It seems like #484 broke the LLVM IR tests; this PR updates the .ll files to match what's now being produced.
2025-09-02 16:14:54 -07:00
aiwhskruht
4752fcdcf2
Api traits test code ( #487 )
Add initial templated API test support. This needs to be improved to use an attribute macro, but that will require some major surgery :(
2025-09-02 08:57:54 -07:00
Andrzej Janik
9d4f1699d0
Do a better job in zluda_trace when saving opaque ELF binaries ( #486 )
2025-08-28 18:23:25 -07:00
Andrzej Janik
ea99dcc0b0
Implement vote instruction and add support for %laneid ( #484 )
2025-08-28 18:23:09 -07:00
Violet
8f484d6a5f
Add support for fp8 to `cvt` ( #468 )
This implements specifically the fp8 conversion instructions needed by llm.c:
* `cvt.rn.satfinite{.relu}.f8x2type.f32`
* `cvt.rn{.relu}.f16x2.f8x2type`
It uses HIP's fp8 and fp16 headers: https://rocm.docs.amd.com/projects/HIP/en/docs-develop/reference/low_fp_types.html#fp8-quarter-precision .
2025-08-28 17:54:07 -07:00
Andrzej Janik
3632f2bf03
Some fixes to BLASLt ( #482 )
2025-08-26 14:28:36 -07:00
Andrzej Janik
ec1358af1c
Add more NVML and cuBLAS coverage ( #481 )
2025-08-26 12:55:12 -07:00
Andrzej Janik
62d340e4bd
Add more host-side functionality ( #480 )
2025-08-25 16:55:17 -07:00
Violet
de319f7c00
Add test for conversion from .f16x2 to .b32 ( #479 )
2025-08-25 15:33:53 -07:00
Violet
00eb553454
Add pass test mechanism for insert_implicit_conversions ( #477 )
2025-08-22 13:01:39 -07:00
Andrzej Janik
e805cb72a5
Add nvml tracing ( #476 )
2025-08-18 23:09:15 -07:00
Andrzej Janik
547536de4a
Add more instructions, tighten generated assembly ( #475 )
2025-08-18 11:12:57 -07:00
Andrzej Janik
65367f04ee
Fix how full-precision fp32 sqrt and div are handled ( #467 )
Previously, when compiling full-precision `sqrt`/`div`, we'd leave it to LLVM. LLVM looks at the module's `denormal-fp-math-f32` mode, which is incompatible with how we handle denormals and could give wrong results in certain edge cases.
Instead, handle it fully inside ZLUDA.
2025-08-14 17:24:40 -07:00
Violet
a420601128
Add test for unrecognized statement error with vector braces ( #472 )
The old code using `take_till_inclusive` assumed that a right brace would be the end of a block and therefore never part of a statement. However, some PTX statements can include vector operands, which are written in braces. This meant that any unrecognized statement with a vector operand would backtrack and eventually produce an unhelpful context error rather than an `UnrecognizedStatement` error.
This pull request also adds a mechanism for testing parser errors.
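The failure mode described above can be sketched in plain Rust, without the parser combinators ZLUDA actually uses. Both function names are hypothetical and exist only for illustration; the brace-aware variant is one possible way to handle vector operands and is not necessarily what this pull request does.

```rust
/// Naive recovery (hypothetical): consume up to and including the first
/// `;` or `}`, assuming `}` can only ever close a block.
fn skip_naive(input: &str) -> &str {
    match input.find(|c: char| c == ';' || c == '}') {
        Some(end) => &input[..=end],
        None => input,
    }
}

/// Brace-aware recovery (hypothetical): a `}` that closes a `{` opened
/// inside the statement belongs to a vector operand, so keep scanning.
fn skip_statement(input: &str) -> &str {
    let mut depth = 0usize;
    for (i, c) in input.char_indices() {
        match c {
            '{' => depth += 1,
            '}' if depth > 0 => depth -= 1,
            // An unmatched `}` ends the enclosing block; `;` ends the statement.
            '}' | ';' => return &input[..=i],
            _ => {}
        }
    }
    input
}
```

On an input such as `bogus.v2 {%r1, %r2}, [%rd1]; ret;`, the naive version stops at the closing brace of the vector operand and leaves `, [%rd1];` behind, while the brace-aware version consumes the whole statement.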
2025-08-13 17:23:51 -07:00
Joëlle van Essen
fe7a18f912
zoc (ZLUDA offline compiler) ( #344 )
2025-08-13 15:27:02 -07:00
Andrzej Janik
28eca3d75a
Implement kernel cache ( #465 )
2025-08-11 09:20:40 -07:00
Violet
d2f92e4267
More tracing for custom parsers ( #471 )
2025-08-08 12:19:27 -07:00
Violet
ef98c1e0ba
Add tracing to custom parsers ( #469 )
2025-08-07 11:51:56 -07:00
Violet
94eec34bdb
More descriptive syntax errors ( #466 )
* More descriptive syntax errors
* cargo fmt
* Remove brackets
2025-08-05 17:23:55 -07:00
Andrzej Janik
a1b7600718
Fix version in nightly trigger ( #464 )
2025-08-04 13:09:37 -07:00
Violet
dd05752fc4
Implement `shf` instruction ( #463 )
* Implement `shf` instruction
* Tests for `shf`
* cargo fmt
2025-08-04 10:15:46 -07:00
Andrzej Janik
ce9c0aac23
In nightly runs, make sure cargo is in $PATH ( #462 )
2025-08-01 09:36:11 -07:00
Andrzej Janik
c00496b92a
Install curl correctly ( #461 )
2025-07-31 19:56:37 -07:00
Andrzej Janik
7fc6caffb6
Update nightly runner's ROCm version to 6.3.4 ( #460 )
2025-07-31 19:46:49 -07:00
Andrzej Janik
52d3ea624c
Fix nightly run ( #459 )
2025-07-31 19:20:39 -07:00
Andrzej Janik
704a94e6f4
Fix nightly tests trigger ( #458 )
2025-07-31 18:31:54 -07:00
Andrzej Janik
cd7e2f8e36
Force loading ZLUDA through LD_PRELOAD ( #447 )
Certain applications (pytorch) decide that it's a great idea to distribute the whole CUDA driver and link to it with DT_RPATH. This ignores LD_LIBRARY_PATH.
This code defeats that evil mechanism through any means necessary.
2025-07-31 18:00:13 -07:00
Violet
96ae27e9e1
Implement cublas functions needed for llm.c ( #457 )
2025-07-31 11:08:53 -07:00
Violet
99c36092be
Use `FromCuda` in `zluda_blas` ( #455 )
2025-07-31 09:52:10 -07:00
Andrzej Janik
49aabffdcc
Rename zluda_dump to zluda_trace ( #456 )
* Rename zluda_dump to zluda_trace
* Minor naming fixes
2025-07-31 08:07:03 -07:00
Violet
4d163a4d9b
Implement `cuModuleGetGlobal_v2` ( #454 )
2025-07-30 16:34:21 -07:00
Violet
66db19a061
Move `FromCuda` and `ZludaObject` into a common crate ( #452 )
* Refactor FromCuda error type to be generic
* Create zluda_common crate
* Move FromCuda trait into zluda_common
* Write some doc comments
* Fix typo
* Edit comment
* Fix formatting
2025-07-30 15:53:22 -07:00
Violet
b8bcbec295
Always use Unix line endings ( #453 )
2025-07-30 15:09:47 -07:00
Violet
21ef5f60a3
Check Rust formatting on pull requests ( #451 )
* Check Rust formatting on pull requests
This should help us maintain consistent style, without having unrelated style changes in pull requests from running `rustfmt`.
* cargo fmt non-generated files
* Ignore generated files
2025-07-30 14:55:09 -07:00
Violet
98b601d15a
Use `normalize_fn` for performance libraries ( #449 )
The goal here is to make the performance library implementations work more like zluda.
2025-07-30 14:02:01 -07:00
Violet
c07d7678cd
Format files ( #450 )
2025-07-30 10:30:49 -07:00
Violet
481c3550fa
Convert CUDA performance lib statuses to Rust result types ( #444 )
These changes replicate how the main library is handled. cuDNN still needs to have zluda_bindgen run and zluda_dump_common updated.
2025-07-29 14:28:14 -07:00
Violet
303e4c2fb2
Update rocm_setup_build.sh ( #446 )
2025-07-29 11:05:16 -07:00
Violet
4ffa669cce
Fix Windows linkage ( #445 )
2025-07-29 10:15:32 -07:00
Violet
d81404eb70
Add support for `rocblas` to `zluda_bindgen` ( #440 )
One step of several for adding cublas support
2025-07-28 15:07:22 -07:00
Andrzej Janik
8dbc7208de
Try to make ZLUDA more robust on Windows ( #442 )
On my machine ZLUDA seems to segfault when initializing LLVM's C++ statics in Blender. Blender ships with a C++ runtime. It seems that statically linking the C++ runtime fixes the issue, though it might actually be unrelated.
Additionally, the dtor crate on Windows seems to use a slightly dodgy method, so replace it with something more straightforward.
2025-07-28 13:20:04 -07:00
Violet
f192dd317a
Use implicit FromCuda for library::get_module ( #439 )
2025-07-28 06:42:14 -07:00
Violet
8c23ef1ded
Rename cuda_base to cuda_macros ( #435 )
2025-07-25 11:09:50 -07:00
Violet
ba38da0bbc
Silence unused variable warnings ( #434 )
I'd left these in originally because I'd assumed that these functions would need full implementations soon, but they're really annoying. I've fixed all the other compiler warnings as well.
2025-07-25 10:28:33 -07:00
Andrzej Janik
c1dda55235
Add nightly tests ( #433 )
2025-07-24 16:14:06 -07:00
Andrzej Janik
e8e20294a6
Set newly created context as current ( #431 )
2025-07-24 11:05:04 -07:00
Andrzej Janik
5deada8426
Add cuCtxCreate_v2 and cuCtxDestroy_v2 ( #430 )
2025-07-23 17:33:59 -07:00
Violet
2b90fdb56c
Add support for `cp.async` ( #427 )
Adds support for
* `cp.async`
* `cp.async.commit_group`
* `cp.async.wait_group`
* `cp.async.wait_all`
Asynchronous copy operations are only supported by AMD Instinct GPUs, so for now we lower them as synchronous copy operations. Because of this, `cp.async.commit_group`, `cp.async.wait_group`, and `cp.async.wait_all` are no-ops.
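The lowering strategy described above can be sketched roughly as follows. The `Instr` and `Lowered` types and the `lower` function are hypothetical stand-ins invented for this illustration; they do not reflect ZLUDA's actual AST or pass structure. The point is only that once every copy completes synchronously, there is never an outstanding group to commit or wait on, so the group instructions can be dropped.

```rust
/// Hypothetical, simplified view of the four `cp.async` instructions.
#[derive(Debug, Clone, PartialEq)]
enum Instr {
    CpAsync { dst: String, src: String, bytes: u32 },
    CpAsyncCommitGroup,
    CpAsyncWaitGroup(u32),
    CpAsyncWaitAll,
}

/// Hypothetical lowered form: a plain, synchronous copy.
#[derive(Debug, Clone, PartialEq)]
enum Lowered {
    Copy { dst: String, src: String, bytes: u32 },
}

/// Lower one instruction. `None` means the instruction emits nothing.
fn lower(instr: Instr) -> Option<Lowered> {
    match instr {
        // The copy has finished by the time the next instruction runs.
        Instr::CpAsync { dst, src, bytes } => Some(Lowered::Copy { dst, src, bytes }),
        // No copy is ever in flight, so there is nothing to commit or wait for.
        Instr::CpAsyncCommitGroup
        | Instr::CpAsyncWaitGroup(_)
        | Instr::CpAsyncWaitAll => None,
    }
}
```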
2025-07-23 16:25:49 -07:00
Andrzej Janik
3746079b1a
Assorted instruction fixes ( #423 )
This fixes transcendentals and some other buggy instructions exposed by `ptx_tests` (abs, neg). It also adds tanh (slow, due to a hardware limitation).
The only two remaining incorrect instructions are div and sqrt with non-default rounding, but this commit is already bloated enough.
2025-07-23 15:50:35 -07:00
Violet
119b635b9d
Emit correct alignment for loads and stores ( #429 )
2025-07-23 14:55:52 -07:00