Andrzej Janik
9dcfb45aa2
Make dumper 32-bit compatible
2021-04-09 21:34:41 +02:00
Andrzej Janik
94af72f46b
Fix 32-bit builds
2021-04-09 20:32:37 +02:00
Andrzej Janik
15f465041d
Implement setp.nan and setp.num
2021-03-03 23:35:18 +01:00
Andrzej Janik
17291019e3
Implement atomic float add
2021-03-03 22:41:47 +01:00
Andrzej Janik
efd91e270c
Implement non-coherent loads and implicit sign-extending conversions
2021-03-03 21:22:31 +01:00
Andrzej Janik
cdac38d572
Support kernel tuning directives
2021-03-03 00:59:47 +01:00
Andrzej Janik
648035a01a
Update rspirv/spirv_headers to the newest version
2021-03-02 01:42:23 +01:00
Andrzej Janik
178ec59af6
Implement bfi instruction
2021-03-01 23:01:53 +01:00
Andrzej Janik
d3cd2dc8b4
Do slightly better when it comes to PTX error recovery
2021-03-01 02:24:27 +01:00
Andrzej Janik
eec55d9d02
Inform about ELF binaries in dumper
2021-02-28 12:49:25 +01:00
Andrzej Janik
06a5cff2d8
Add our nvml to the build
2021-02-28 02:11:22 +01:00
Andrzej Janik
088ff760de
Tell linguist to stop counting third-party code
2021-02-28 02:08:22 +01:00
Andrzej Janik
ba83bb28f7
Inject our own NVML
2021-02-28 01:50:04 +01:00
Andrzej Janik
b7ee6d66c3
Implement enough nvml to make GeekBench happy
2021-02-28 00:46:50 +01:00
Andrzej Janik
871b8d1bef
Update level_zero-sys with the newest extension
2021-02-27 21:24:01 +01:00
Andrzej Janik
bfae2e0d21
Allow overriding device compute version in dumper
2021-02-27 20:55:19 +01:00
Andrzej Janik
4d3e37befc
Update README.md ( #42 )
2021-02-22 01:32:04 +01:00
Andrzej Janik
a906c350f2
Make misc fixes ( #41 )
...
* Update ze_loader.lib to the newest version
* Export _ptsz/_ptds for which we have legacy stream implementations
* Stop producing build logs if we are not looking at them anyway
2021-02-22 01:29:03 +01:00
Andrzej Janik
ab690c6491
Add zluda_redirect.dll to CI builds ( #40 )
2021-02-21 17:44:42 +01:00
Andrzej Janik
4ed9ef8edb
Improve CI ( #39 )
...
* Use official GPU driver packages for building on Linux
* Start building on Windows
* Start uploading artifacts
2021-02-21 14:44:58 +01:00
Andrzej Janik
36514bd6eb
Improve ZLUDA injection ( #37 )
...
Improve the injector and redirector so it's no longer necessary to manually mess with files if the application links nvcuda.dll. Additionally, inject into child processes
2021-02-20 21:40:19 +01:00
Andrzej Janik
972f612562
Fix signed integer conversion ( #36 )
...
This fixes the last remaining bug preventing an end-to-end GeekBench run, so also update Geekbench results in README
2021-01-26 21:05:09 +01:00
Andrzej Janik
3e2e73ac33
Add script for replaying dumped kernel ( #34 )
...
zluda_dump can already create traces of GPU execution, this script can replay those traces.
Additionally, added just enough code in core ZLUDA to support simple PyCUDA execution
2021-01-23 16:57:07 +01:00
Andrzej Janik
ff8135e8a3
Add a library for dumping kernel arguments before and after launch ( #18 )
2021-01-16 22:28:48 +01:00
Andrzej Janik
09f679693b
Prevent linker from stripping exports on Linux ( #33 )
2021-01-15 01:17:44 +01:00
Andrzej Janik
5cd9a5fbc4
Add empty implementation of cuDeviceGetLuid ( #30 )
...
This function is required by recent versions of CUDA runtime on Windows
2021-01-08 19:43:46 +01:00
Andrzej Janik
237a6c113a
Regenerate SPIR-V tests ( #29 )
...
In one of the previous commits we made a change to mark ld/st as aligned. This change was not propagated to test files
2021-01-08 19:06:11 +01:00
Andrzej Janik
078ae20c2c
Improve build procedure and instructions ( #28 )
...
Fixes issues pointed out in #27 :
* spirv_tools-sys was built in non-test profiles
* By default ZLUDA dll has a wrong name
* We relied on third-party OpenCL installation on Windows
* We encouraged building debug configuration
* We didn't provide build information for developers (cmake, python, submodules)
2021-01-08 17:17:46 +01:00
Andrzej Janik
2c0e9b912f
Fix Windows ZLUDA injector ( #26 )
...
Fix various bugs in injector and redirector, make them more robust and enable building them by default
2021-01-03 18:45:48 +01:00
Andrzej Janik
659b2c6ec4
Merge commit '4b96dbc8f49c5ae00c96935e0b576df88a5d8af9'
2021-01-03 17:54:01 +01:00
Andrzej Janik
4b96dbc8f4
Squashed 'ext/detours/' changes from 39aa864..36b69b9
...
36b69b9 Make Detours MinGW Clang-compatible
git-subtree-dir: ext/detours
git-subtree-split: 36b69b971888b2ca0c5913563bae011efaa4a42e
2021-01-03 17:54:01 +01:00
Andrzej Janik
77523940b3
Merge commit 'dabc40cb19bf4e297c32284d26c74adbd6775e49' as 'ext/detours'
2021-01-03 17:52:14 +01:00
Andrzej Janik
dabc40cb19
Squashed 'ext/detours/' content from commit 39aa864
...
git-subtree-dir: ext/detours
git-subtree-split: 39aa864d2985099c8d847e29a5fb86618039b9c4
2021-01-03 17:52:14 +01:00
Takeshi Watanabe
ae950163cd
Add building only CI ( #25 )
...
Testing isn't working yet because some tests require a live Intel GPU and a live NVIDIA GPU
2020-12-29 22:54:48 +01:00
Andrzej Janik
63af70a01f
Fix builtins generation, mark ld/st as aligned ( #22 )
...
Two changes:
* Fixes to builtins generation that I forgot to include in #21
* Marking of ld/st as aligned - this gives a big performance boost in GeekBench SFFT
2020-12-12 20:40:24 +01:00
Andrzej Janik
a3cfa24593
Fix SPIR-V code generation for PTX special registers ( #21 )
...
We currently directly map PTX special registers: %ntid, %tid, etc. to SPIR-V builtins with type OpTypeVector %uint 4.
This is wrong and leads to silent corruption, which fails e.g. Depth of Field in GeekBench
2020-12-11 21:31:08 +01:00
vosen
770a379452
Refactor how vectors are handled ( #20 )
...
Current code has a problem with handling vector members: "b.x" in "mov.u32 a, b.x". This functionality has been kinda tacked-on and has annoying issues:
* vector members support is only limited to being source of movs (so "add.u32 a.x, b.x, c.y" will not work)
* the width of "b" in "b.x" is not known, which led to some "interesting" workarounds
* passes can either convert all member accesses to other member accesses or to temporaries. No way to convert some member accesses to temporaries (which we need for an important fix)
This commit solves all this
2020-12-09 00:20:06 +01:00
vosen
a6a9eb347b
Merge pull request #15 from nilsmartel/patch-2
...
Fix small typo
2020-11-29 00:36:05 +01:00
vosen
295a70e1cb
Merge pull request #14 from ritschwumm/patch-1
...
fix typo in readme
2020-11-29 00:35:44 +01:00
Nils Martel
f452550c4f
Fix small typo
2020-11-27 14:26:27 +01:00
ritschwumm
b11ba3d1f3
fix typo in readme
2020-11-27 07:24:51 +01:00
Andrzej Janik
103881f70a
Update wording, add license
2020-11-24 23:23:53 +01:00
Andrzej Janik
892e47a653
Update README with links to GeekBench results
2020-11-23 22:38:12 +01:00
Andrzej Janik
690f4f3ad2
Append short project name to the device if there's not enough space for long name
2020-11-23 22:24:35 +01:00
Andrzej Janik
8fa044004f
Change wording slightly
2020-11-23 22:18:30 +01:00
Andrzej Janik
25fc385b8d
Add graph with Geekbench results
2020-11-23 22:15:59 +01:00
Andrzej Janik
bcd1740ba9
Add README and rebuild .spv library
2020-11-23 21:50:21 +01:00
Andrzej Janik
db491dadf2
Remove temporary file
2020-11-23 20:02:47 +01:00
Andrzej Janik
eb7c9aeeee
Rename everything
2020-11-23 20:01:10 +01:00
Andrzej Janik
0415f873ae
Throw away useless stuff
2020-11-23 20:00:57 +01:00