343 commits

Author SHA1 Message Date
Andrzej Janik
82b5cef0bd Carry state space with pointer 2021-05-15 15:58:11 +02:00
Andrzej Janik
425edfcdd4 Simplify typing 2021-05-07 18:22:09 +02:00
Andrzej Janik
7f051ad20e Fix and test 2021-05-06 01:32:45 +02:00
Andrzej Janik
9d92a6e284 Start converting the translation to one type type 2021-05-05 22:56:58 +02:00
Andrzej Janik
d51aaaf552 Throw away special variable types 2021-04-17 14:01:50 +02:00
Andrzej Janik
a55c851eaa Add comment 2021-04-15 20:01:01 +02:00
Andrzej Janik
8cd3db6648 Remove LdStType 2021-04-15 19:53:54 +02:00
Andrzej Janik
4d04fe251d Remove all remaining subenums 2021-04-15 19:21:52 +02:00
Andrzej Janik
a0baad9456 Convert enums to 1TT 2021-04-15 19:10:45 +02:00
Andrzej Janik
a005c92c61 Index from 0 2021-04-12 01:07:45 +02:00
Andrzej Janik
fedf88180a Dump all modules, even if not enqueued 2021-04-12 00:42:35 +02:00
Andrzej Janik
96f95d59ce Make zluda_dump more robust 2021-04-12 00:18:27 +02:00
Andrzej Janik
a39dda67d1 Make dumper compatible with older versions of CUDA 2021-04-10 23:01:01 +02:00
Andrzej Janik
8393dbd6e9 More fixes for 32bit 2021-04-09 22:00:23 +02:00
Andrzej Janik
9dcfb45aa2 Make dumper 32-bit compatible 2021-04-09 21:34:41 +02:00
Andrzej Janik
94af72f46b Fix 32-bit builds 2021-04-09 20:32:37 +02:00
Andrzej Janik
15f465041d Implement setp.nan and setp.num 2021-03-03 23:35:18 +01:00
Andrzej Janik
17291019e3 Implement atomic float add 2021-03-03 22:41:47 +01:00
Andrzej Janik
efd91e270c Implement non-coherent loads and implicit sign-extending conversions 2021-03-03 21:22:31 +01:00
Andrzej Janik
cdac38d572 Support kernel tuning directives 2021-03-03 00:59:47 +01:00
Andrzej Janik
648035a01a Update rspirv/spirv_headers to the newest version 2021-03-02 01:42:23 +01:00
Andrzej Janik
178ec59af6 Implement bfi instruction 2021-03-01 23:01:53 +01:00
Andrzej Janik
d3cd2dc8b4 Do slightly better when it comes to PTX error recovery 2021-03-01 02:24:27 +01:00
Andrzej Janik
eec55d9d02 Inform about ELF binaries in dumper 2021-02-28 12:49:25 +01:00
Andrzej Janik
06a5cff2d8 Add our nvml to the build 2021-02-28 02:11:22 +01:00
Andrzej Janik
088ff760de Tell linguist to stop counting third-party code 2021-02-28 02:08:22 +01:00
Andrzej Janik
ba83bb28f7 Inject our own NVML 2021-02-28 01:50:04 +01:00
Andrzej Janik
b7ee6d66c3 Implement enough nvml to make GeekBench happy 2021-02-28 00:46:50 +01:00
Andrzej Janik
871b8d1bef Update level_zero-sys with the newest extension 2021-02-27 21:24:01 +01:00
Andrzej Janik
bfae2e0d21 Allow overriding device compute version in dumper 2021-02-27 20:55:19 +01:00
Andrzej Janik
4d3e37befc Update README.md (#42) 2021-02-22 01:32:04 +01:00
Andrzej Janik
a906c350f2 Make misc fixes (#41)
* Update ze_loader.lib to the newest version
* Export _ptsz/_ptds for which we have legacy stream implementations
* Stop producing build logs if we are not looking at them anyway
2021-02-22 01:29:03 +01:00
Andrzej Janik
ab690c6491 Add zluda_redirect.dll to CI builds (#40) 2021-02-21 17:44:42 +01:00
Andrzej Janik
4ed9ef8edb Improve CI (#39)
* Use official GPU driver packages for building on Linux
* Start building on Windows
* Start uploading artifacts
2021-02-21 14:44:58 +01:00
Andrzej Janik
36514bd6eb Improve ZLUDA injection (#37)
Improve the injector and redirector so it's no longer required to manually mess with files if the application links nvcuda.dll. Additionally, inject into child processes
2021-02-20 21:40:19 +01:00
Andrzej Janik
972f612562 Fix signed integer conversion (#36)
This fixes the last remaining bug preventing an end-to-end GeekBench run, so also update GeekBench results in README
2021-01-26 21:05:09 +01:00
Andrzej Janik
3e2e73ac33 Add script for replaying dumped kernel (#34)
zluda_dump can already create traces of GPU execution; this script can replay those traces.
Additionally, added just enough code in core ZLUDA to support simple PyCUDA execution
2021-01-23 16:57:07 +01:00
Andrzej Janik
ff8135e8a3 Add a library for dumping kernel arguments before and after launch (#18) 2021-01-16 22:28:48 +01:00
Andrzej Janik
09f679693b Prevent linker from stripping exports on Linux (#33) 2021-01-15 01:17:44 +01:00
Andrzej Janik
5cd9a5fbc4 Add empty implementation of cuDeviceGetLuid (#30)
This function is required by recent versions of CUDA runtime on Windows
2021-01-08 19:43:46 +01:00
Andrzej Janik
237a6c113a Regenerate SPIR-V tests (#29)
In one of the previous commits we made a change to mark ld/st as aligned. This change was not propagated to test files
2021-01-08 19:06:11 +01:00
Andrzej Janik
078ae20c2c Improve build procedure and instructions (#28)
Fixes issues pointed out in #27:
* spirv_tools-sys was built in non-test profiles
* By default ZLUDA dll has a wrong name
* We relied on third-party OpenCL installation on Windows
* We encouraged building debug configuration
* We didn't provide build information for developers (cmake, python, submodules)
2021-01-08 17:17:46 +01:00
Andrzej Janik
2c0e9b912f Fix Windows ZLUDA injector (#26)
Fix various bugs in injector and redirector, make them more robust and enable building them by default
2021-01-03 18:45:48 +01:00
Andrzej Janik
659b2c6ec4 Merge commit '4b96dbc8f49c5ae00c96935e0b576df88a5d8af9' 2021-01-03 17:54:01 +01:00
Andrzej Janik
4b96dbc8f4 Squashed 'ext/detours/' changes from 39aa864..36b69b9
36b69b9 Make Detours MinGW Clang-compatible

git-subtree-dir: ext/detours
git-subtree-split: 36b69b971888b2ca0c5913563bae011efaa4a42e
2021-01-03 17:54:01 +01:00
Andrzej Janik
77523940b3 Merge commit 'dabc40cb19bf4e297c32284d26c74adbd6775e49' as 'ext/detours' 2021-01-03 17:52:14 +01:00
Andrzej Janik
dabc40cb19 Squashed 'ext/detours/' content from commit 39aa864
git-subtree-dir: ext/detours
git-subtree-split: 39aa864d2985099c8d847e29a5fb86618039b9c4
2021-01-03 17:52:14 +01:00
Takeshi Watanabe
ae950163cd Add building only CI (#25)
Testing isn't working yet because some tests require a live Intel GPU and a live NVIDIA GPU
2020-12-29 22:54:48 +01:00
Andrzej Janik
63af70a01f Fix builtins generation, mark ld/st as aligned (#22)
Two changes:
* Fixes to builtins generation that I forgot to include in #21
* Marking of ld/st as aligned - this gives a big performance boost in GeekBench SFFT
2020-12-12 20:40:24 +01:00
Andrzej Janik
a3cfa24593 Fix SPIR-V code generation for PTX special registers (#21)
We currently directly map PTX special registers (%ntid, %tid, etc.) to SPIR-V builtins with type OpTypeVector %uint 4.
This is wrong and leads to silent corruption, which fails e.g. Depth of Field in GeekBench
2020-12-11 21:31:08 +01:00