@rschwietzke
Contributor

These changes won't noticeably change the runtime, but they reduce the resource needs and free up capacity: fewer branches, fewer instructions, fewer cycles, fewer cache misses.

Old Performance Data

        660,414.10 msec task-clock                       #    7.386 CPUs utilized             
           261,219      context-switches                 #  395.538 /sec                      
            27,304      cpu-migrations                   #   41.344 /sec                      
         1,559,640      page-faults                      #    2.362 K/sec                     
 2,155,072,417,633      cycles                           #    3.263 GHz                       
   394,882,775,879      stalled-cycles-frontend          #   18.32% frontend cycles idle      
 4,062,894,368,682      instructions                     #    1.89  insn per cycle            
                                                  #    0.10  stalled cycles per insn   
   751,461,828,221      branches                         #    1.138 G/sec                     
    14,439,955,921      branch-misses                    #    1.92% of all branches           

      89.417514945 seconds time elapsed

     593.675541000 seconds user
      62.852506000 seconds sys

New Performance Data

        645,923.27 msec task-clock                       #    7.331 CPUs utilized             
           274,570      context-switches                 #  425.081 /sec                      
            29,801      cpu-migrations                   #   46.137 /sec                      
         1,582,803      page-faults                      #    2.450 K/sec                     
 2,096,071,549,319      cycles                           #    3.245 GHz                       
   372,558,046,965      stalled-cycles-frontend          #   17.77% frontend cycles idle      
 3,883,824,436,885      instructions                     #    1.85  insn per cycle            
                                                  #    0.10  stalled cycles per insn   
   707,559,172,285      branches                         #    1.095 G/sec                     
    14,447,594,308      branch-misses                    #    2.04% of all branches           

      88.107659677 seconds time elapsed

     576.587159000 seconds user
      64.967407000 seconds sys
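
To make the two perf stat runs easier to compare, here is a small sketch that computes the relative change of a few counters. The class and method names are made up for illustration; the numbers are copied from the output above. By this arithmetic, the new run executes roughly 4.4% fewer instructions, 5.8% fewer branches, and 2.7% fewer cycles, while elapsed time drops by only about 1.5%. The absolute number of branch misses is nearly unchanged, so the higher miss rate (2.04% vs. 1.92%) mostly reflects the smaller number of branches.

```java
public class PerfDelta {
    /** Relative change from oldValue to newValue in percent; negative means a reduction. */
    public static double percentChange(double oldValue, double newValue) {
        return (newValue - oldValue) / oldValue * 100.0;
    }

    public static void main(String[] args) {
        // Counter values copied verbatim from the two perf stat runs above.
        System.out.printf("instructions:  %+.2f%%%n", percentChange(4_062_894_368_682L, 3_883_824_436_885L));
        System.out.printf("branches:      %+.2f%%%n", percentChange(751_461_828_221L, 707_559_172_285L));
        System.out.printf("cycles:        %+.2f%%%n", percentChange(2_155_072_417_633L, 2_096_071_549_319L));
        System.out.printf("branch-misses: %+.2f%%%n", percentChange(14_439_955_921L, 14_447_594_308L));
        System.out.printf("elapsed time:  %+.2f%%%n", percentChange(89.417514945, 88.107659677));
    }
}
```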

@rschwietzke added the performance, java, and PRIORITY: NORMAL labels on Sep 26, 2025
Contributor

Copilot AI left a comment

Pull Request Overview

This PR implements several performance optimizations aimed at reducing CPU cycles, instructions, branches, and cache misses without changing the runtime behavior. The changes focus on optimizing hot code paths in data processing and hash computation.

Key changes:

  • Replace reflection-based object creation with direct instantiation using switch expressions for common data types
  • Optimize hash computation algorithms to reduce operations and improve cache efficiency
  • Remove unnecessary hash computation calls during object initialization
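
The first key change can be pictured roughly as follows. This is only a sketch of the general technique, not the code from DataRecordFactory.java: the record classes, the type codes, and the fallback registry are all hypothetical stand-ins, since the actual diff is not part of this conversation. Common types are created by a direct `new` in a switch expression, and only the rare ones take the slower reflective path.

```java
import java.util.Map;

public class RecordFactorySketch {
    // Hypothetical record types; the real classes are not shown in this conversation.
    public interface DataRecord {}
    public static final class RequestRecord implements DataRecord {}
    public static final class ActionRecord implements DataRecord {}
    public static final class CustomRecord implements DataRecord {}

    // Hypothetical registry for less common type codes, still created reflectively.
    private static final Map<Character, Class<? extends DataRecord>> REGISTRY =
        Map.of('C', CustomRecord.class);

    public static DataRecord create(char typeCode) {
        // Hot path: direct instantiation, no reflective lookup or invocation.
        return switch (typeCode) {
            case 'R' -> new RequestRecord();
            case 'A' -> new ActionRecord();
            default -> createReflectively(typeCode);
        };
    }

    private static DataRecord createReflectively(char typeCode) {
        Class<? extends DataRecord> clazz = REGISTRY.get(typeCode);
        if (clazz == null) {
            throw new IllegalArgumentException("Unknown type code: " + typeCode);
        }
        try {
            return clazz.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + clazz.getName(), e);
        }
    }
}
```

Keeping the reflective fallback means unknown or custom types still work; only the types handled in the switch benefit from the shortcut.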

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

File                    Description
DataRecordFactory.java  Replaces reflection with switch-based direct instantiation for common data record types
XltCharBuffer.java      Refactors hash computation to reduce operations and improve performance
AbstractData.java       Removes premature hash computation during object initialization
StringHasher.java       Adds optimized hash method for XltCharBuffer using native String operations
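
One common way to avoid hashing during initialization is to compute the hash on first use and cache it, as `java.lang.String` does. The sketch below illustrates that pattern only; whether AbstractData.java or XltCharBuffer.java do exactly this is an assumption, and the class name and the `computations` counter are invented for the example.

```java
import java.util.Arrays;

public class LazyHashSketch {
    private final char[] chars;
    private int hash;          // cached hash code; 0 means "not computed yet"
    private int computations;  // exists only to make the laziness observable

    public LazyHashSketch(String s) {
        // Deliberately no hashCode() call here: objects that are never
        // used as map keys never pay for the hash loop.
        this.chars = s.toCharArray();
    }

    public int computations() {
        return computations;
    }

    @Override
    public int hashCode() {
        int h = hash;
        // A real hash of 0 is simply recomputed on every call, as in String.
        if (h == 0 && chars.length > 0) {
            computations++;
            for (char c : chars) {
                h = 31 * h + c;
            }
            hash = h;
        }
        return h;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof LazyHashSketch
            && Arrays.equals(chars, ((LazyHashSketch) o).chars);
    }
}
```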

for (; i < l2; i++)
{
-    h = 31 * h + src[i];
+    h = (src[i] - h) + (h << 5);
Copilot AI Sep 26, 2025

The hash computation formula (src[i] - h) + (h << 5) differs significantly from the standard multiplication-based approach used in the main loop. This inconsistency makes the algorithm harder to understand and may produce different hash distribution characteristics. Consider using a consistent approach like h = 31 * h + src[i] or document why this specific formula was chosen.

Suggested change
- h = (src[i] - h) + (h << 5);
+ h = 31 * h + src[i];
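
A note on the comment above: under Java's wrapping `int` arithmetic, `h << 5` equals `32 * h`, so `(src[i] - h) + (h << 5)` is `src[i] + 31 * h`. The two forms therefore produce the same value for every input and the hash distribution does not change; the real difference is readability and which instructions the JIT emits. The sketch below checks this; the class and method names are illustrative and not taken from the PR.

```java
public class HashStepEquivalence {
    /** The multiplication-based step from the old code. */
    public static int multiplyStep(int h, char c) {
        return 31 * h + c;
    }

    /** The shift-based step from the new code. */
    public static int shiftStep(int h, char c) {
        return (c - h) + (h << 5);
    }

    /** Hashes src with either step, starting from 0 like String.hashCode(). */
    public static int hash(char[] src, boolean useShift) {
        int h = 0;
        for (char c : src) {
            h = useShift ? shiftStep(h, c) : multiplyStep(h, c);
        }
        return h;
    }

    public static void main(String[] args) {
        char[] sample = "https://example.com/some/path?x=1".toCharArray();
        System.out.println(hash(sample, false) == hash(sample, true));            // true
        System.out.println(hash(sample, true) == new String(sample).hashCode());  // true
    }
}
```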

Comment on lines +71 to +80
String _s = s.toString();
final int pos = _s.indexOf(limitingChar);
if (pos > 0)
{
    return _s.substring(0, pos).hashCode();
}
else
{
    return _s.hashCode();
}
Copilot AI Sep 26, 2025

The condition pos > 0 skips the truncation when the limiting character is at position 0, so the method returns the hash of the full string instead of the hash of the empty prefix. This should be pos >= 0 to handle a limiting character in the first position correctly.
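
The difference only shows up when the limiting character is the very first character. The sketch below contrasts the two conditions; it takes a plain `String` instead of an XltCharBuffer to stay self-contained, and the class and method names are hypothetical. Which result is intended depends on the caller's contract, which this conversation does not state.

```java
public class LimitedHashSketch {
    /** Mirrors the reviewed code: truncates only when the limiter is found after index 0. */
    public static int hashAsSubmitted(String s, char limitingChar) {
        final int pos = s.indexOf(limitingChar);
        return pos > 0 ? s.substring(0, pos).hashCode() : s.hashCode();
    }

    /** Mirrors the review suggestion: also truncates when the limiter is at index 0. */
    public static int hashWithSuggestedFix(String s, char limitingChar) {
        final int pos = s.indexOf(limitingChar);
        return pos >= 0 ? s.substring(0, pos).hashCode() : s.hashCode();
    }

    public static void main(String[] args) {
        // Limiter in the middle: both variants hash the prefix "ab".
        System.out.println(hashAsSubmitted("ab,cd", ',') == hashWithSuggestedFix("ab,cd", ','));
        // Limiter at index 0: the submitted code hashes ",abc", the fix hashes "".
        System.out.println(hashAsSubmitted(",abc", ',') == ",abc".hashCode());
        System.out.println(hashWithSuggestedFix(",abc", ',') == "".hashCode());
    }
}
```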

