Commit graph

202 commits

Author SHA1 Message Date
Andrzej Janik
85a0e600fc Fix exit handling in codegen 2024-05-16 19:29:37 +02:00
Andrzej Janik
922692d2fa Merge branch 'master' into meshroom
# Conflicts:
#	ptx/lib/zluda_ptx_impl.bc
#	ptx/src/ast.rs
#	ptx/src/emit.rs
#	ptx/src/ptx.lalrpop
#	ptx/src/test/spirv_run/mod.rs
#	ptx/src/translate.rs
#	zluda/src/cuda.rs
#	zluda/src/impl/surface.rs
2024-05-16 02:23:29 +02:00
NyanCatTW1
fcd7a57888
Fix + improve vprintf implementation (#211) 2024-05-16 00:38:52 +02:00
Andrzej Janik
f0c905db15
Fix trap instruction codegen, don't fail build with older Rust versions (#229) 2024-05-08 15:19:59 +02:00
Andrzej Janik
27c0e13677
Minor codegen improvements (#225) 2024-05-06 00:28:49 +02:00
Andrzej Janik
bdc652f9eb
Correctly report emulated wave32 CUDA device (#216) 2024-04-29 15:09:14 +02:00
Andrzej Janik
995bc95174
Build improvements (#206)
* Allow to create .zip package on Windows
* Allow to create .tar.gz package on Linux
* Add configuration for post-build GitHub CI
2024-04-28 01:22:43 +02:00
Andrzej Janik
5d5f7cca75
Rewrite surface implementation to more accurately support unofficial CUDA semantics (#203)
This fixes black screen in some CompuBench tests (TV-L1 Optical Flow) and other apps that use CUDA surfaces incorrectly
2024-04-14 02:39:34 +02:00
Andrzej Janik
774f4bcb37
Implement sad instruction (#198) 2024-04-06 01:23:53 +02:00
Andrzej Janik
0d9ace2475
Fix buggy carry flags when mixing subc/sub.cc with addc/add.cc (#197) 2024-04-05 23:26:08 +02:00
NyanCatTW1
76bae5f91b
Implement mad.hi.cc (#196) 2024-04-05 19:12:59 +02:00
Andrzej Janik
b695f44c18
Support old PTX compression scheme (#188) 2024-03-29 02:03:23 +01:00
Andrzej Janik
7d4147c8b2
Add Blender 4.2 support (#184)
Redo primary context and fix various long-standing bugs around this API
2024-03-28 17:12:10 +01:00
Andrzej Janik
8671f2e674 Document Meshroom usage 2024-03-26 19:48:40 +01:00
Andrzej Janik
a8bb615f1f Cosmetic test fix 2024-03-26 03:03:15 +01:00
Andrzej Janik
c64018db89 More hacks for mipmapped texobjs 2024-03-26 01:35:42 +01:00
Andrzej Janik
93987a2cfe Apply f16 hacks to texture objects and mipmapped arrays 2024-03-25 23:49:57 +01:00
Andrzej Janik
b7b8502859 Add failing test, make tiny fixes 2024-03-22 21:25:10 +01:00
Andrzej Janik
1ede61c696
Disable even more optional LLVM components (#179) 2024-03-17 14:53:15 +01:00
Andrzej Janik
f47a93a951
Fix reported build errors (#178) 2024-03-17 01:32:48 +01:00
Ikko Eltociear Ashimine
14a4016964
Update README.md (#166)
underying -> underlying
2024-03-08 01:35:05 +01:00
Andrzej Janik
f8db1b8c63 Add cuMipmappedArrayDestroy and cuMipmappedArrayGetLevel 2024-03-08 01:33:59 +01:00
Andrzej Janik
b0440bf9ba Add cuMipmappedArrayCreate 2024-03-07 18:38:30 +00:00
Andrzej Janik
f76516fa04 Resolve a bug with decl and definition of static entry 2024-03-07 01:03:37 +00:00
Andrzej Janik
383dde6b35 Simplify compilation of globals in initializers, fix bfind.u64 2024-03-03 17:26:23 +01:00
Andrzej Janik
4b4f33e29e Implement fn pointers in global initializers 2024-03-01 00:41:23 +01:00
Andrzej Janik
a1c265b7c2 Add failing test 2024-02-29 13:14:55 +01:00
Andrzej Janik
7d501f8d08 Add support for tex.level 2024-02-29 12:25:57 +01:00
Andrzej Janik
c910a85685 Implement .noreturn directive 2024-02-28 13:22:25 +01:00
Andrzej Janik
4363545d0e Implement isspacep 2024-02-27 19:57:17 +01:00
Seb Ospina
af0216b1a0
Fix adrenalin software link (#139)
The link that should be for AMD Adrenalin was pointing to ROCm linux info
2024-02-26 12:43:46 +01:00
Andrzej Janik
4a81dbffb5
Update llama.cpp support (#102)
Add sign extension support to prmt, allow set.<op>.f16x2.f16x2, add more BLAS mappings
2024-02-16 00:01:21 +01:00
Ikko Eltociear Ashimine
9f7be97ef6
Update README.md (#100)
uderlying -> underlying
2024-02-15 18:15:31 +01:00
Andrzej Janik
8d10f756a9
Add troubleshooting/debugging instructions (#91) 2024-02-15 13:25:52 +01:00
ManInDark
c884348427
Fixed typo in readme (#89) 2024-02-15 01:38:42 +01:00
Arna13
0c3bf2d9d0
Fixing typo in README.md (#63) 2024-02-13 21:57:51 +01:00
Sean McLemon
f2a44e0e05
Tidy up some English in ARCHITECTURE.md (#61) 2024-02-13 21:55:21 +01:00
Andrzej Janik
1b9ba2b233 Nobody expects the Red Team
Too many changes to list, but broadly:
* Remove Intel GPU support from the compiler
* Add AMD GPU support to the compiler
* Remove Intel GPU host code
* Add AMD GPU host code
* More device instructions. From 40 to 68
* More host functions. From 48 to 184
* Add proof of concept implementation of OptiX framework
* Add minimal support of cuDNN, cuBLAS, cuSPARSE, cuFFT, NCCL, NVML
* Improve ZLUDA launcher for Windows
2024-02-11 20:45:51 +01:00
Andrzej Janik
60d2124a16
Search for a new developer (#44) 2021-02-28 12:18:44 +01:00
Andrzej Janik
4d3e37befc
Update README.md (#42) 2021-02-22 01:32:04 +01:00
Andrzej Janik
a906c350f2
Make misc fixes (#41)
* Update ze_loader.lib to the newest version
* Export _ptsz/_ptds for which we have legacy stream implementations
* Stop producing build logs if we are not looking at them anyway
2021-02-22 01:29:03 +01:00
Andrzej Janik
ab690c6491
Add zluda_redirect.dll to CI builds (#40) 2021-02-21 17:44:42 +01:00
Andrzej Janik
4ed9ef8edb
Improve CI (#39)
* Use official GPU driver packages for building on Linux
* Start building on Windows
* Start uploading artifacts
2021-02-21 14:44:58 +01:00
Andrzej Janik
36514bd6eb
Improve ZLUDA injection (#37)
Improve injector&redirector so it's no longer required to manually mess with files if the application links nvcuda.dll. Additionally inject into child processes
2021-02-20 21:40:19 +01:00
Andrzej Janik
972f612562
Fix signed integer conversion (#36)
This fixes the last remaining bug preventing an end-to-end Geekbench run, so also update Geekbench results in README
2021-01-26 21:05:09 +01:00
Andrzej Janik
3e2e73ac33 Add script for replaying dumped kernel (#34)
zluda_dump can already create traces of GPU execution, this script can replay those traces.
Additionally, added just enough code in core ZLUDA to support simple PyCUDA execution
2021-01-23 16:57:07 +01:00
Andrzej Janik
ff8135e8a3
Add a library for dumping kernels arguments before and after launch (#18) 2021-01-16 22:28:48 +01:00
Andrzej Janik
09f679693b
Prevent linker from stripping exports on Linux (#33) 2021-01-15 01:17:44 +01:00
Andrzej Janik
5cd9a5fbc4
Add empty implementation of cuDeviceGetLuid (#30)
This function is required by recent versions of CUDA runtime on Windows
2021-01-08 19:43:46 +01:00
Andrzej Janik
237a6c113a
Regenerate SPIR-V tests (#29)
In one of the previous commits we made a change to mark ld/st as aligned. This change was not propagated to test files
2021-01-08 19:06:11 +01:00