Commit graph

344 commits

Author SHA1 Message Date
Andrzej Janik
a3cfa24593
Fix SPIR-V code generation for PTX special registers (#21)
We currently map PTX special registers (%ntid, %tid, etc.) directly to SPIR-V builtins with type OpTypeVector %uint 4.
This is wrong and leads to silent corruption, which makes e.g. Depth of Field in GeekBench fail
2020-12-11 21:31:08 +01:00
vosen
770a379452
Refactor how vectors are handled (#20)
The current code has a problem handling vector members, such as "b.x" in "mov.u32 a, b.x". This functionality was tacked on and has annoying issues:
* vector member support is limited to the source operand of movs (so "add.u32 a.x, b.x, c.y" will not work)
* the width of "b" in "b.x" is not known, which led to some "interesting" workarounds
* passes can convert either all member accesses to other member accesses or all of them to temporaries; there is no way to convert only some member accesses to temporaries (which we need for an important fix)
This commit solves all of these
2020-12-09 00:20:06 +01:00
vosen
a6a9eb347b
Merge pull request #15 from nilsmartel/patch-2
Fix small typo
2020-11-29 00:36:05 +01:00
vosen
295a70e1cb
Merge pull request #14 from ritschwumm/patch-1
fix typo in readme
2020-11-29 00:35:44 +01:00
Nils Martel
f452550c4f Fix small typo 2020-11-27 14:26:27 +01:00
ritschwumm
b11ba3d1f3 fix typo in readme 2020-11-27 07:24:51 +01:00
Andrzej Janik
103881f70a Update wording, add license 2020-11-24 23:23:53 +01:00
Andrzej Janik
892e47a653 Update README with links to GeekBench results 2020-11-23 22:38:12 +01:00
Andrzej Janik
690f4f3ad2 Append short project name to the device if there's not enough space for long name 2020-11-23 22:24:35 +01:00
Andrzej Janik
8fa044004f Change wording slightly 2020-11-23 22:18:30 +01:00
Andrzej Janik
25fc385b8d Add graph with Geekbench results 2020-11-23 22:15:59 +01:00
Andrzej Janik
bcd1740ba9 Add README and rebuild .spv library 2020-11-23 21:50:21 +01:00
Andrzej Janik
db491dadf2 Remove temporary file 2020-11-23 20:02:47 +01:00
Andrzej Janik
eb7c9aeeee Rename everything 2020-11-23 20:01:10 +01:00
Andrzej Janik
0415f873ae Throw away useless stuff 2020-11-23 20:00:57 +01:00
Andrzej Janik
cd141590be Fix typo in selp 2020-11-22 21:50:54 +01:00
Andrzej Janik
2e8e55738c Add 8bit memset 2020-11-22 18:42:34 +01:00
Andrzej Janik
6e39c4a90c Fix linking with shl/shr, add memset on host and support __assertfail 2020-11-21 01:53:07 +01:00
Andrzej Janik
84ac086146 Fix problems with linking 2020-11-21 00:27:37 +01:00
Andrzej Janik
70dc298381 Fix buggy handling of u8 shared memory 2020-11-20 00:07:50 +01:00
Andrzej Janik
f77b653d36 Implement stateless-to-stateful optimization 2020-11-19 22:12:12 +01:00
Andrzej Janik
eac5fbd806 Support more property queries 2020-11-14 15:48:05 +01:00
Andrzej Janik
a6765baa3a Add back erroneously removed functionality 2020-11-12 22:47:14 +01:00
Andrzej Janik
a2e77fe961 Refactor host code to use one big lock 2020-11-12 20:12:14 +01:00
Andrzej Janik
7c93997cc9 Append project URL to device name and add few missing CUDA v1 functions 2020-11-07 18:08:09 +01:00
Andrzej Janik
62d14cdffe Fix ftz behavior slightly 2020-11-07 16:14:37 +01:00
Andrzej Janik
ac6265f257 Implement instructions bfe, rem, xor 2020-11-06 00:56:45 +01:00
Andrzej Janik
d7bf1acf84 Implement instructions clz, brev, popc 2020-11-05 22:10:06 +01:00
Andrzej Janik
8e409254b3 Fix same width float-to-float conversions 2020-11-05 21:39:34 +01:00
Andrzej Janik
96702d86c9 Fix issues with .param/.local and implement sin, cos, ex2, lg2 2020-11-05 00:27:46 +01:00
Andrzej Janik
e5a53ed5d3 Implement neg instruction 2020-11-01 14:58:44 +01:00
Andrzej Janik
b7d61baf37 Implement div, sqrt, rsqrt and more of setp 2020-11-01 14:34:03 +01:00
Andrzej Janik
a82eb20817 Implement atomic instructions 2020-10-31 21:28:15 +01:00
Andrzej Janik
861116f223 Add support for fma instruction 2020-10-26 23:46:28 +01:00
Andrzej Janik
c8dadca7d2 Implement selp instruction 2020-10-26 19:18:23 +01:00
Andrzej Janik
fc7cc00f47 Add support for and instruction 2020-10-26 18:45:28 +01:00
Andrzej Janik
40bdb83e6b Support float constants 2020-10-26 01:49:25 +01:00
Andrzej Janik
17b788f2a7 Implement ftz handling through Intel extension 2020-10-25 21:09:16 +01:00
Andrzej Janik
45f5183370 Implement ftz handling through Khronos extensions 2020-10-25 19:29:28 +01:00
Andrzej Janik
6480cccc4f Implement rcp instruction 2020-10-25 11:21:51 +01:00
Andrzej Janik
eb9053a42f Add test for indirect shared mem use 2020-10-25 10:34:09 +01:00
Andrzej Janik
85ee8210df Add dynamic shared mem support 2020-10-25 00:24:40 +02:00
Andrzej Janik
28a0968294 Fix small regression 2020-10-18 15:06:37 +02:00
Andrzej Janik
2b3ecc99e3 Implement pass to handle .extern .shared and add parsing code for it 2020-10-18 14:46:05 +02:00
Andrzej Janik
27d25865af Add support for top-level global variables, improve array support 2020-10-04 19:53:07 +02:00
Andrzej Janik
9a65dd32f5 Add sub, min, max 2020-10-02 00:11:28 +02:00
Andrzej Janik
bd3d440dba Implement or 2020-10-01 20:28:57 +02:00
Andrzej Janik
96a342e33f Implement shr 2020-10-01 18:13:09 +02:00
Andrzej Janik
3e92921275 Fix remaining bugs in vector destructuring and in the process improve implicit conversions 2020-10-01 18:11:57 +02:00
Andrzej Janik
1e0b35be4b Implement vector-destructuring mov/ld/st 2020-09-30 19:27:29 +02:00