Contributor

@hiro-v hiro-v commented Dec 12, 2023

On Windows there is a serious problem when running on CPU only, and the NVIDIA GPU cannot be used properly either.

See the comments below for the test results.

@hiro-v hiro-v added this to the Jan on Windows milestone Dec 12, 2023
@hiro-v hiro-v requested review from hiento09 and tikikun December 12, 2023 03:56
@hiro-v hiro-v self-assigned this Dec 12, 2023
@hiro-v hiro-v force-pushed the fix/windows_cuda_nitro branch from 133de60 to 22c8e18 Compare December 12, 2023 07:01
@hiro-v hiro-v force-pushed the fix/windows_cuda_nitro branch from 22c8e18 to 2d78857 Compare December 12, 2023 12:47
Contributor Author

hiro-v commented Dec 12, 2023

  • CPU:
    CleanShot_2023-12-12_at_23 47 26
  • Jan app usage:
    CleanShot_2023-12-12_at_23 47 43

Machine:

  • OS: Windows 11 Home

  • GPU: none (no AMD or NVIDIA GPU)

  • CPU: Intel Core i7 (8th gen)

  • Memory: 16GB

  • Model: TinyLlama (other models yield the same performance)

  • Performance: 15 tokens/s (very reasonable)

  • It is not laggy because it uses physical cores rather than logical cores. On Windows machines with hyper-threading there are normally 2 logical cores per physical core, and Intel 12th gen and later CPUs mix performance and efficiency cores, so the right count is not easy to calculate. This setup helps a lot on a typical computer: compute-intensive tasks (running the LLM) are scheduled on the performance cores, while IO-intensive tasks (even OS background tasks) run on the others.
    The CPU is unlikely to hit 100% and stay there, which is better than what we experienced with the current stable release on Windows.
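The physical-versus-logical core point above can be illustrated with a minimal sketch. This is not the code from this PR: `pick_thread_count` and its parameters are hypothetical names, and it assumes a uniform hyper-threading ratio of 2 logical cores per physical core. That assumption does not hold on hybrid CPUs (Intel 12th gen and later), where efficiency cores have no hyper-threading, so a real implementation would need to query the actual topology.

```python
import os


def pick_thread_count(logical_cores, threads_per_core=2, reserve=0):
    """Estimate how many inference threads to use (hypothetical helper).

    logical_cores:    number of logical CPUs reported by the OS
    threads_per_core: assumed logical cores per physical core (2 with
                      hyper-threading; an assumption, wrong on hybrid CPUs)
    reserve:          physical cores to leave free for the OS and UI
    """
    if logical_cores < 1 or threads_per_core < 1 or reserve < 0:
        raise ValueError("core counts must be positive and reserve non-negative")
    # Integer division approximates the physical core count.
    physical = max(1, logical_cores // threads_per_core)
    # Always keep at least one thread for inference.
    return max(1, physical - reserve)


if __name__ == "__main__":
    # os.cpu_count() returns the number of logical CPUs (or None if unknown).
    logical = os.cpu_count() or 1
    print(f"logical={logical}, suggested threads={pick_thread_count(logical)}")
```

Capping the thread count at the estimated physical core count leaves the sibling logical cores available for the UI and background tasks, which fits the observation that the CPU no longer sits at 100%.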

However, we observed some Windows-specific problems:

  • After Windows sleeps and wakes up, performance decreases by approx. 20%.
  • Performance also decreases when Windows runs without the power plugged in, but I did not have a chance to test this.

Contributor Author

hiro-v commented Dec 12, 2023

  • CPU:
    CleanShot 2023-12-13 at 00 28 40
  • NVIDIA GPU VRAM usage with Nitro log
    CleanShot 2023-12-13 at 00 29 42
  • GPU utilization and VRAM consumption
    CleanShot 2023-12-13 at 00 30 37
  • Jan App performance
    CleanShot 2023-12-13 at 00 29 58

Machine:

  • OS: Windows 11 Home
  • GPU: NVIDIA RTX 3090 with 24GB VRAM
  • CPU: Intel Core i9 (13th gen)
  • Memory: 64GB

Test:

  • Model: TinyLlama
  • Nitro version: 0.1.26
  • Performance (impressive): 57 tokens/s (almost fully offloaded to the NVIDIA GPU)

@hiro-v hiro-v requested review from a team and removed request for hiento09 and tikikun December 12, 2023 17:39
Contributor Author

hiro-v commented Dec 12, 2023

Test result on the laptop that previously had no NVIDIA GPU: I have now plugged in an NVIDIA GTX 1050 Ti with 4GB of VRAM.
The best thing is that after I close the Jan app and open it again, it can use the NVIDIA GPU right away without any further installation, which is good UX.

Here is the test result:

  • Jan app result (22 tokens/s)
    CleanShot 2023-12-13 at 00 58 08

  • CPU usage
    CleanShot 2023-12-13 at 00 57 14

  • GPU usage
    CleanShot 2023-12-13 at 00 57 24

  • Nitro NVIDIA utilization
    CleanShot 2023-12-13 at 00 57 35

  • Model: TinyLlama

  • Perf: 22 tokens/s
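To put the three TinyLlama results from this thread side by side, here is a small sketch that computes throughput and relative speedup. The function names are hypothetical and only the tokens/s figures come from the comments above. The RTX 3090 machine also has a different CPU and more memory, so its ratio is not a controlled comparison.

```python
def tokens_per_second(n_tokens, elapsed_s):
    """Throughput for one generation: tokens produced / wall-clock seconds."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    return n_tokens / elapsed_s


def speedup(baseline_tps, new_tps):
    """How many times faster new_tps is than baseline_tps."""
    if baseline_tps <= 0:
        raise ValueError("baseline_tps must be positive")
    return new_tps / baseline_tps


# Figures reported in this thread (TinyLlama, tokens/s).
REPORTED = {
    "laptop, CPU only": 15,
    "laptop + GTX 1050 Ti": 22,
    "desktop, RTX 3090": 57,
}

if __name__ == "__main__":
    baseline = REPORTED["laptop, CPU only"]
    for name, tps in REPORTED.items():
        print(f"{name}: {tps} tokens/s ({speedup(baseline, tps):.2f}x vs CPU only)")
```

On the same laptop, plugging in the GTX 1050 Ti takes throughput from 15 to 22 tokens/s, roughly 1.47x.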

@hiro-v hiro-v merged commit a7099a4 into main Dec 13, 2023
@hiro-v hiro-v deleted the fix/windows_cuda_nitro branch December 13, 2023 04:24