Skip to content

3528: AVRO-4194: remove PHP EOL versions and use new language features#17

Open
martin-augment wants to merge 16 commits intomainfrom
pr-3528-2025-11-05-06-44-15
Open

3528: AVRO-4194: remove PHP EOL versions and use new language features#17
martin-augment wants to merge 16 commits intomainfrom
pr-3528-2025-11-05-06-44-15

Conversation

@martin-augment
Copy link
Copy Markdown
Owner

@martin-augment martin-augment commented Nov 5, 2025

3528: To review by AI


Note

Raise PHP requirement to 8.1+, refactor the PHP library with strict typing and interfaces, add lint/static analysis, and update CI/tooling and tests accordingly.

  • Breaking/Platform:
    • Require php:^8.1; remove 7.3/7.4/8.0 from CI matrices and README.
  • Core PHP library:
    • Apply strict typing and PHP 8 features across codebase (typed properties/returns, declare(strict_types=1)).
    • Convert AvroIO to an interface; update AvroFile/AvroStringIO and callers; adjust method signatures.
    • Refactor protocol/schema handling (new AvroAliasedSchema, improved name/alias handling, logical types support, safer parsing with exceptions).
    • Improve DataFile reader/writer, encoders/decoders, and duration/decimal handling; add more precise exceptions.
  • Tooling/Build:
    • Add .php-cs-fixer.dist.php; introduce lint/lint-fix and phpstan targets in build.sh.
    • Update composer.json dev deps (phpunit 10, php-cs-fixer, phpstan, rector); add phpstan.neon; update phpunit.xml for v10.
    • Extend .gitignore for caches.
  • CI:
    • Test PHP 8.1–8.4; run lint (8.1), phpstan, and tests; use Composer 2.8.12.
    • Interop job installs zstd/snappy extensions and generates/validates interop data; sets up Maven/Java.
  • Docs/Examples/Tests:
    • Update README requirements; modernize examples (short arrays).
    • Migrate tests to PHP 8 attributes, strict types, and updated assertions.
  • Docker:
    • Bump Composer to 2.8.12 in build image.

Written by Cursor Bugbot for commit d650ca1. This will update automatically on new commits. Configure here.

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Nov 5, 2025

Walkthrough

This PR modernizes the PHP Avro implementation by requiring PHP 8.1+, adding static analysis tooling (PHPStan, PHP-CS-Fixer, Rector), implementing strict typed properties and return types throughout the codebase, converting AvroIO from a class to an interface, and migrating tests to use PHP 8 attributes for data providers.

Changes

Cohort / File(s) Summary
CI/CD & Dependencies
.github/workflows/test-lang-php.yml, composer.json, phpunit.xml, phpstan.neon, share/docker/Dockerfile
Updated PHP matrix to 8.1+ with 8.4 support; added PHPStan and PHP-CS-Fixer dev dependencies; updated PHPUnit to ^10.5; bumped Composer installer to 2.8.12; added PHPUnit configuration updates for modern setup; added PHPStan configuration with analysis level 3.
Build & Configuration
lang/php/.gitignore, lang/php/.php-cs-fixer.dist.php, lang/php/build.sh, lang/php/README.md
Added cache file ignores for PHPUnit and PHP-CS-Fixer; introduced PHP-CS-Fixer configuration; added lint-fix target to build script; updated PHP requirement documentation from 7.3+ to 8.1+.
Core Types
lang/php/lib/Avro.php, lang/php/lib/AvroDebug.php, lang/php/lib/AvroGMP.php
Added strict typing to properties and methods; converted boolean/int parameters to explicit types; updated docblocks with GMP resource type annotations (\GMP instead of resource).
IO Layer
lang/php/lib/AvroIO.php, lang/php/lib/IO/AvroFile.php, lang/php/lib/IO/AvroStringIO.php
Converted AvroIO from concrete class to interface with public constants and strictly-typed method signatures; AvroFile and AvroStringIO now implement AvroIO with constructor promotion and explicit return types; added Stringable implementation to AvroStringIO.
Data File I/O
lang/php/lib/DataFile/AvroDataIO.php, lang/php/lib/DataFile/AvroDataIOReader.php, lang/php/lib/DataFile/AvroDataIOWriter.php
Added strict property typing, constructor property promotion for dependency injection, explicit return and parameter types across all methods; updated codec handling with type narrowing.
Datum Encoding
lang/php/lib/Datum/AvroIOBinaryDecoder.php, lang/php/lib/Datum/AvroIOBinaryEncoder.php
Introduced constructor property promotion, strict parameter/return types (bool, int, string, float, void, mixed unions), narrowed method signatures with schema-specific types.
Datum Reader/Writer
lang/php/lib/Datum/AvroIODatumReader.php, lang/php/lib/Datum/AvroIODatumWriter.php
Implemented constructor property promotion for schema parameters, replaced switch statements with match expressions, added strict parameter and return types, narrowed method signatures to accept schema-specific types.
Datum Types
lang/php/lib/Datum/Type/AvroDuration.php
Implemented Stringable interface; converted properties to private readonly int with explicit typing.
Schema Base & Utilities
lang/php/lib/Schema/AvroSchema.php, lang/php/lib/Schema/AvroName.php, lang/php/lib/Schema/AvroAliasedSchema.php, lang/php/lib/Schema/AvroLogicalType.php, lang/php/lib/Schema/AvroNamedSchemata.php
Implemented Stringable on AvroSchema and AvroName; added strict typing throughout; AvroSchema now uses constructor-promoted readonly property for type; AvroNamedSchemata uses property promotion; new AvroAliasedSchema interface added.
Schema Named Types
lang/php/lib/Schema/AvroNamedSchema.php, lang/php/lib/Schema/AvroRecordSchema.php, lang/php/lib/Schema/AvroEnumSchema.php, lang/php/lib/Schema/AvroFixedSchema.php, lang/php/lib/Schema/AvroUnionSchema.php
Implemented AvroAliasedSchema on AvroNamedSchema; added strict typing to properties and methods; AvroRecordSchema refactored with parseFields signature changes and explicit field construction; AvroFixedSchema removed doc parameter from factory methods; all methods include explicit return types.
Schema Collections
lang/php/lib/Schema/AvroArraySchema.php, lang/php/lib/Schema/AvroMapSchema.php, lang/php/lib/Schema/AvroPrimitiveSchema.php
Added explicit return types (string|array) to toAvro() and values methods; AvroMapSchema enhanced with isValuesSchemaFromSchemata tracking; replaced loose to strict equality comparisons.
Protocol
lang/php/lib/Protocol/AvroProtocol.php, lang/php/lib/Protocol/AvroProtocolMessage.php
Converted to constructor-based design with readonly properties; realParse made static; AvroProtocolMessage now accepts array $avro, namespace, and schemata; exception handling added for JSON parse errors.
Examples
lang/php/examples/write_read.php
Updated array syntax from array() to short [] syntax.
Unit Tests
lang/php/test/DatumIOTest.php, lang/php/test/DataFileTest.php, lang/php/test/FloatIntEncodingTest.php, lang/php/test/IODatumReaderTest.php, lang/php/test/InterOpTest.php, lang/php/test/LongEncodingTest.php, lang/php/test/NameTest.php, lang/php/test/ProtocolFileTest.php, lang/php/test/SchemaTest.php, lang/php/test/StringIOTest.php
Migrated from docblock @dataProvider to PHP 8 DataProvider attributes; added strict_types declaration; introduced constructor property promotion in test helper classes; updated method names to snake_case; added explicit return types (void) to test methods; strict type hints on test parameters.
Test Infrastructure
lang/php/test/test_helper.php, lang/php/test/generate_interop_data.php
Added public constants for directory paths (AVRO_BASE_DIR, AVRO_SHARE_DIR, AVRO_BUILD_DIR, etc.); converted define() to const declaration; minor strict comparison updates in data generation.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch pr-3528-2025-11-05-06-44-15

Comment @coderabbitai help to get the list of available commands and usage tips.

@claude
Copy link
Copy Markdown

claude bot commented Nov 5, 2025

Pull Request Review: PHP 8.1+ Modernization (AVRO-4194)

Overview

This PR modernizes the PHP Avro library by dropping support for EOL PHP versions (7.3, 7.4, 8.0) and adopting PHP 8.1+ language features. Overall, this is a well-structured modernization effort with significant improvements to code quality and maintainability.

✅ Strengths

1. Modern PHP Features Adoption

  • Typed Properties: Excellent use of typed class properties throughout (e.g., private AvroIO $io, private int $block_count)
  • Strict Types: declare(strict_types=1) added consistently across files
  • Constructor Property Promotion: Used effectively in several classes (e.g., AvroDataIOReader, AvroIOBinaryDecoder)
  • Return Type Declarations: Comprehensive addition of return types to all methods
  • Union Types: Proper use of PHP 8.0+ union types (e.g., string|int, null|string|AvroSchema)

2. Architecture Improvements

  • AvroIO Conversion: Converting AvroIO from a class to an interface (lang/php/lib/AvroIO.php:31) is architecturally sound and enforces proper contracts
  • New Interface: AvroAliasedSchema interface added for better abstraction
  • Better Encapsulation: Type hints enforce better API contracts

3. Tooling & Quality Assurance

  • PHPStan: Added static analysis (level 3) with proper configuration
  • PHP-CS-Fixer: Comprehensive code style configuration with parallel processing
  • Updated CI: Removed EOL PHP versions from GitHub Actions workflow
  • Dependency Updates: PHPUnit upgraded to ^10.5 (compatible with PHP 8.1+)

4. Test Coverage

  • All 12 test files have been updated with improved assertions and modern syntax
  • Test files properly refactored: ~400 lines added, ~300 removed across test suite
  • Tests use modern PHPUnit features

⚠️ Issues & Concerns

1. Breaking Changes (CRITICAL)

This PR contains breaking changes that need documentation:

  • PHP Version Requirement: Bumped from ^7.3 || ^8.0 to ^8.1 (composer.json:26)
  • Interface Change: AvroIO changed from class to interface - any code extending AvroIO will break
  • Method Signatures: Strict type hints may break implementations/extensions with different signatures
  • Return Types: Methods now require specific return types - affects all implementers

Recommendation:

  • This should be tagged as a major version bump (semver)
  • Update CHANGES.md with comprehensive breaking changes documentation
  • Consider migration guide for users

2. Potential Bugs

a. Double Cast in AvroDataIO.php:102

return strlen((string) self::magic());

This cast is suspicious - if self::magic() already returns a string, the cast is redundant. If it doesn't, this should be investigated.

b. Empty Return in AvroIOBinaryDecoder.php:161

public static function longBitsToDouble(string $bits)
{
    $double = unpack('e', $bits);
    return (double) $double[1];

The diff was truncated, but ensure this function properly returns a value.

c. Union Type Ordering (lang/php/lib/DataFile/AvroDataIOWriter.php:76)

null|string|AvroSchema $writers_schema = null

PHP convention is to put null last in union types: string|AvroSchema|null. This isn't a bug but affects readability.

3. Documentation Gaps

  • Missing Return Types in DocBlocks: Some methods still have incomplete @returns tags (e.g., missing types)
  • PHPStan Level: Currently at level 3 - consider gradually increasing to level 5+ for stricter analysis
  • README: Updated version requirements but no migration guide mentioned

4. Code Quality Details

a. Inconsistent Array Syntax

While most arrays have been modernized to [], ensure consistency throughout:

$x = []; // Good - modern syntax used throughout

b. GMP Documentation (lang/php/lib/AvroGMP.php:35-47)

PHPDoc comments reference @var resource for GMP objects, but in PHP 8.1+ GMP functions return \GMP objects, not resources. Update to:

/**
 * @var \GMP memoized GMP resource for zero
 */

Status: This appears to have been updated correctly to \GMP - good!

c. Error Handling

No changes to exception handling patterns - verify existing patterns are still appropriate with strict types.

5. Performance Considerations

  • Typed Properties: Slight performance improvement due to optimized property access
  • Strict Types: May have minor performance benefit from eliminating type juggling
  • No Apparent Regressions: The changes are primarily type-related, unlikely to affect runtime performance

6. Security

  • ✅ No obvious security concerns introduced
  • ✅ Strict typing improves type safety and reduces certain vulnerability classes
  • ✅ No changes to serialization/deserialization logic that would affect security

📋 Recommendations

High Priority

  1. Document Breaking Changes: Create comprehensive migration guide
  2. Version Bump: Use proper semver (major version bump)
  3. Verify Cast Operations: Check the (string) cast in AvroDataIO.php:102
  4. CI Testing: Ensure all tests pass on PHP 8.1, 8.2, 8.3, and 8.4

Medium Priority

  1. PHPStan Level: Consider increasing from level 3 to level 4-5 incrementally
  2. Union Type Ordering: Standardize to put null last in union types
  3. Add PHP 8.4 Support: Consider adding PHP 8.4 to the CI matrix if ready

Low Priority

  1. Code Coverage: Add code coverage reporting to CI
  2. Rector Rules: Document or add Rector configuration for automated refactoring consistency
  3. Deprecation Strategy: For next breaking changes, consider using @deprecated tags first

🎯 Test Coverage Assessment

  • 12 test files updated with modernized syntax
  • ✅ Tests include: DataFile, DatumIO, Encoding (Float/Long), Interop, Name, Protocol, Schema, StringIO
  • ✅ Test structure preserved while updating assertions
  • ⚠️ No new tests added for PHP 8.1+ specific features - consider adding tests for strict type behavior

🔒 Security Assessment

  • ✅ No security vulnerabilities introduced
  • ✅ Type safety improvements reduce bug surface area
  • ✅ No changes to authentication, authorization, or data validation logic

Summary

This is a high-quality modernization PR that brings the codebase up to modern PHP standards. The use of typed properties, strict types, and modern tooling significantly improves code quality. However, this introduces breaking changes that require:

  1. Major version bump
  2. Comprehensive documentation
  3. Migration guide for users

Recommendation: Approve with minor revisions (address breaking change documentation and verify the casting issues noted above).


Files Changed: 52 files (+2,768 lines, -2,396 lines)
Scope: PHP version upgrade, type safety, tooling modernization
Risk Level: Medium (breaking changes, but well-tested)

$aliases
name: $new_name,
schemata: $schemata,
aliases: $aliases
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Silent doc loss in Avro duration schemas

The AvroFixedSchema::duration() method doesn't accept a doc parameter, but $doc is extracted from the schema at line 382. In the old code, $doc was passed to the duration factory method but was being ignored. While the new code correctly doesn't pass $doc, the extracted $doc value (line 382) is never used for duration schemas, which means documentation is silently discarded for duration logical types. The AvroFixedSchema constructor (called inside duration()) passes null for the doc parameter to parent AvroNamedSchema, discarding any doc attribute from the schema definition.

Fix in Cursor Fix in Web

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

value:useful; category:bug; feedback:The Bugbot AI agent is correct that the changes in the PR make the doc (documentation) property of Fixed schemas even more unused. Before it was passed to the constructor but not used. Now it is not even passed. Technically, there is no regression but it would have been better to make use of it instead of dropping it completely.

Copy link
Copy Markdown

@augmentcode augmentcode bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review completed. 1 suggestions posted.

}

$value = $decimal * (10 ** $scale);
$value = ((float) $decimal) * (10 ** $scale);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Casting $decimal to float before scaling can lose precision and lead to incorrect decimal encoding for larger magnitudes or scales; consider avoiding float conversion and using an integer/arbitrary-precision approach to compute the unscaled value (also applies to other locations in the PR if similar).

🤖 React with 👍 or 👎 to let us know if the comment was useful.

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

value:useful; category:bug; feedback:The Augment AI reviewer is correct that the casting here looks wrong. It prevents precision loss. It would be better to use a special decimal type/library that preserves the precision.

@github-actions
Copy link
Copy Markdown

github-actions bot commented Nov 5, 2025

  • lang/php/lib/Datum/AvroIODatumReader.php:225 — The rewritten attributesMatch() now returns true even when the fully-qualified names differ and the reader schema has no aliases (or none that match), because the unconditional return false; after the alias loop was removed. That lets schemasMatch() treat incompatible named types (records/enums/fixed) as compatible, breaking schema resolution and causing bad decodes instead of throwing. Please restore the fallback return false when no alias matches, and only return false inside the loop after checking all aliases (i.e. keep the old “return true on first matching alias, otherwise fail” behaviour).

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (9)
lang/php/lib/Schema/AvroName.php (1)

157-168: Add explicit return types for consistency.

Methods fullname() and qualifiedName() lack explicit return type declarations while other methods in this class (e.g., namespace() at line 170) now have them. For consistency with the modernization effort, these should declare : string return types.

Apply this diff to add explicit return types:

-    public function fullname()
+    public function fullname(): string
     {
         return $this->fullname;
     }
 
-    public function qualifiedName()
+    public function qualifiedName(): string
     {
         return $this->qualified_name;
     }
lang/php/lib/Schema/AvroMapSchema.php (1)

41-63: Verify null-safe access to $schemata parameter.

The constructor calls $schemata->schemaByName() on Line 49 without first checking if $schemata is null. Since the parameter is nullable (?AvroNamedSchemata &$schemata = null), this will cause a null pointer error if $schemata is not provided.

Apply this diff to add a null check:

         $this->isValuesSchemaFromSchemata = false;
         $values_schema = null;
         if (
             is_string($values)
+            && $schemata !== null
             && $values_schema = $schemata->schemaByName(
                 new AvroName($values, null, $defaultNamespace)
             )
lang/php/lib/IO/AvroStringIO.php (2)

126-139: Restore interface-compatible read signature

Line 126 drops the int type hint on $len, so AvroStringIO::read() no longer matches AvroIO::read(int $len): string. In PHP 8.1 this triggers a fatal “must be compatible” error when the class is loaded. Please keep the integer parameter type.

-    public function read($len): string
+    public function read(int $len): string

169-172: Fix SEEK_CUR bounds check

Line 169 uses $this->current_index + $whence when guarding against underflow, but $whence is the mode constant (0/1/2), not the displacement. With a negative $offset the guard never fires, letting $current_index fall below zero and leading to string-offset errors later. Compare against $offset instead.

-                if (0 > $this->current_index + $whence) {
+                if (0 > $this->current_index + $offset) {
lang/php/lib/Schema/AvroUnionSchema.php (1)

50-83: Initialise $this->schemas before use

Lines 80-83 append to $this->schemas, but the typed property is never initialised. In PHP 8.1 a statement like $this->schemas[] = … on an uninitialised typed property throws “must not be accessed before initialization”, so constructing unions now blows up. Seed the property with an empty array before the loop.

-        $this->schemaFromSchemataIndices = [];
-        $schema_types = [];
+        $this->schemaFromSchemataIndices = [];
+        $this->schemas = [];
+        $schema_types = [];
lang/php/lib/Schema/AvroField.php (1)

80-117: Initialise $default to preserve old semantics

$default is only assigned when $hasDefault is true. For fields without defaults, calling defaultValue() (or other access) now throws “Typed property AvroField::$default must not be accessed before initialization”. Previously this quietly returned null, so this is a new runtime break.

-    private mixed $default;
+    private mixed $default = null;
lang/php/lib/DataFile/AvroDataIOWriter.php (1)

100-106: Default codec must be enforced when reading existing files

When we open an existing data file (no writer schema passed in), older Avro containers legitimately omit the avro.codec metadata key and rely on the implicit "null" codec. Accessing that key now returns null, and because $this->codec is a string-typed property, PHP 8.1+ throws a TypeError, breaking reads of every legacy file. We need to fall back to AvroDataIO::NULL_CODEC before assigning and keep the metadata consistent.

-            $this->metadata[AvroDataIO::METADATA_CODEC_ATTR] = $this->codec
-                = $dfr->metadata[AvroDataIO::METADATA_CODEC_ATTR];
+            $codecFromFile = $dfr->metadata[AvroDataIO::METADATA_CODEC_ATTR] ?? AvroDataIO::NULL_CODEC;
+            $this->metadata[AvroDataIO::METADATA_CODEC_ATTR] = $this->codec = $codecFromFile;
lang/php/lib/DataFile/AvroDataIOReader.php (1)

75-83: Restore null-codec default to avoid new TypeError

$this->metadata can legitimately lack avro.codec (older files) which used to mean “null” codec. After the refactor $codec becomes null, and assigning it to the string-typed $this->codec immediately raises a TypeError, so every legacy container fails to open. Please restore the implicit default before validation.

-        $codec = $this->metadata[AvroDataIO::METADATA_CODEC_ATTR] ?? null;
-        if ($codec && !AvroDataIO::isValidCodec($codec)) {
+        $codec = $this->metadata[AvroDataIO::METADATA_CODEC_ATTR] ?? AvroDataIO::NULL_CODEC;
+        if (!AvroDataIO::isValidCodec($codec)) {
             throw new AvroDataIOException(sprintf('Unknown codec: %s', $codec));
         }
         $this->codec = $codec;
lang/php/lib/Schema/AvroRecordSchema.php (1)

94-109: Guard $schemata before dereferencing

AvroRecordSchema::__construct still defaults $schemata to null, and external callers rely on that. The new code unconditionally calls $schemata->schemaByName(...), so new AvroRecordSchema($name, $doc, $fields) now fatals with “Call to a member function schemaByName() on null.” Please reintroduce the null guard before dereferencing.

-            if (
-                is_string($type)
-                && $field_schema = $schemata->schemaByName(
+            if (
+                $schemata !== null
+                && is_string($type)
+                && ($field_schema = $schemata->schemaByName(
                     new AvroName($type, null, $default_namespace)
-                )
-            ) {
+                ))
+            ) {
🧹 Nitpick comments (10)
lang/php/lib/Datum/Type/AvroDuration.php (1)

30-50: Good immutability enforcement with readonly properties.

The readonly modifier correctly prevents modification after construction, and explicit int typing improves type safety.

Since this PR aims to leverage new language features, consider using constructor property promotion (PHP 8.0+) for more concise, idiomatic code:

-    private readonly int $months;
-
-    private readonly int $days;
-
-    private readonly int $milliseconds;
-
     /**
      * @throws AvroException
      */
     public function __construct(
-        int $months,
-        int $days,
-        int $milliseconds
+        private readonly int $months,
+        private readonly int $days,
+        private readonly int $milliseconds
     ) {
         self::validateComponent($months, 'months');
         self::validateComponent($days, 'days');
         self::validateComponent($milliseconds, 'milliseconds');
-
-        $this->months = $months;
-        $this->days = $days;
-        $this->milliseconds = $milliseconds;
     }

With promoted properties, assignment happens on entry; if validation throws, the incomplete object is safely discarded.

lang/php/test/FloatIntEncodingTest.php (1)

138-145: Remove unused $ruby_to_generate_vals.
Line 138 assigns the heredoc to $ruby_to_generate_vals, but the variable is never read afterward. This leaves dead code in place and surfaces PHPMD’s UnusedLocalVariable warning. Please drop the assignment (or turn it into a regular comment) to keep the test tidy.

-        $ruby_to_generate_vals = <<<_RUBY
-                  def d2lb(d); [d].pack('E') end
-                  dary = (-10..10).to_a + [-1234.2132, -211e23]
-                  dary.each {|x| b = d2lb(x); puts %/array(self::DOUBLE_TYPE, (double) #{x}, #{b.inspect}, '#{b.unpack('h*')[0]}'),/}
-                  def f2ib(f); [f].pack('e') end
-                  fary = (-10..10).to_a + [-1234.5, -211.3e6]
-                  fary.each {|x| b = f2ib(x); puts %/array(self::FLOAT_TYPE, (float) #{x}, #{b.inspect}, '#{b.unpack('h*')[0]}'),/}
-            _RUBY;
lang/php/test/generate_interop_data.php (1)

56-56: Consider standard comparison order for consistency.

The comparison uses a Yoda condition (AvroDataIO::NULL_CODEC == $codec). While this prevents accidental assignment in languages without strict comparison, PHP 8.1+ and modern static analysis tools make this defensive pattern less necessary.

For consistency with modern PHP style guides, consider:

-    $file_name = AvroDataIO::NULL_CODEC == $codec ? 'php.avro' : sprintf('php_%s.avro', $codec);
+    $file_name = $codec === AvroDataIO::NULL_CODEC ? 'php.avro' : sprintf('php_%s.avro', $codec);

Note: This uses strict equality (===) instead of loose equality (==) for better type safety.

lang/php/lib/Schema/AvroPrimitiveSchema.php (1)

114-125: Excellent type safety improvement.

The explicit string|array return type and strict comparison operator (===) correctly align with PHP 8.1+ best practices. The added comment clarifies the intent, though it could be slightly more concise.

Consider condensing the comment to one line for brevity:

-        // This check has been added (I guess) to avoid printing something like this for primitive types:
-        // {"name": "something", "type": {"type": "string"}}
+        // Avoid nested type structure for primitives: return type string directly instead of {"type": "string"}
         if (1 === count($avro)) {
lang/php/test/StringIOTest.php (1)

83-87: Minor redundant cast.

The (string) cast on Line 85 is unnecessary since AvroStringIO::string() already returns a string type. However, this is defensive and doesn't cause issues.

Consider removing the redundant cast:

         $this->assertEquals(
             57,
-            strlen((string) $strio->string()),
+            strlen($strio->string()),
             AvroDebug::asciiString($strio->string())
         );
lang/php/phpstan.neon (1)

16-22: Consider incrementally increasing PHPStan level.

Level 3 is a reasonable starting point for introducing static analysis. As the codebase stabilizes with PHP 8.1+ features, consider gradually increasing to level 4 or 5 to catch additional type-related issues and improve code quality.

lang/php/lib/DataFile/AvroDataIO.php (1)

100-103: Optional: Redundant type cast.

The (string) cast on line 102 is technically redundant since pack() always returns a string. However, it may help static analyzers understand the type flow, so it's acceptable to keep for clarity.

lang/php/test/DatumIOTest.php (1)

58-82: Consider adding parentheses for clarity.

While the operator precedence is correct (unary minus binds tighter than bitshift in PHP), explicit parentheses would improve readability, especially for expressions like -1 << 24 which could be written as (-1) << 24.

Example for a few lines:

-            ['"long"', -1 << 24, "\xFF\xFF\xFF\x0F"],
-            ['"long"', -1 << 16, "\xFF\xFF\x07"],
+            ['"long"', (-1) << 24, "\xFF\xFF\xFF\x0F"],
+            ['"long"', (-1) << 16, "\xFF\xFF\x07"],
lang/php/lib/Datum/AvroIOBinaryDecoder.php (2)

184-187: Remove unnecessary return statement.

In a method with void return type, the explicit return; statement on line 186 is unnecessary.

     public function skipNull(): void
     {
-        return;
     }

249-252: Consider whether the $writers_schema parameter is necessary.

The static analysis tool flags that the $writers_schema parameter is unused in skipEnum. While this may be intentional for API consistency with other skip methods, consider whether it can be removed or if it's reserved for future use.

If the parameter is truly unnecessary:

-    public function skipEnum(AvroSchema $writers_schema, AvroIOBinaryDecoder $decoder): void
+    public function skipEnum(AvroIOBinaryDecoder $decoder): void
     {
         $decoder->skipInt();
     }

Note: Verify that all callers can be updated accordingly before making this change.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4d5841a and d650ca1.

📒 Files selected for processing (52)
  • .github/workflows/test-lang-php.yml (1 hunks)
  • composer.json (1 hunks)
  • lang/php/.gitignore (1 hunks)
  • lang/php/.php-cs-fixer.dist.php (1 hunks)
  • lang/php/README.md (1 hunks)
  • lang/php/build.sh (2 hunks)
  • lang/php/examples/write_read.php (1 hunks)
  • lang/php/lib/Avro.php (3 hunks)
  • lang/php/lib/AvroDebug.php (4 hunks)
  • lang/php/lib/AvroGMP.php (3 hunks)
  • lang/php/lib/AvroIO.php (5 hunks)
  • lang/php/lib/DataFile/AvroDataIO.php (1 hunks)
  • lang/php/lib/DataFile/AvroDataIOReader.php (9 hunks)
  • lang/php/lib/DataFile/AvroDataIOWriter.php (1 hunks)
  • lang/php/lib/Datum/AvroIOBinaryDecoder.php (11 hunks)
  • lang/php/lib/Datum/AvroIOBinaryEncoder.php (4 hunks)
  • lang/php/lib/Datum/AvroIODatumReader.php (12 hunks)
  • lang/php/lib/Datum/AvroIODatumWriter.php (7 hunks)
  • lang/php/lib/Datum/Type/AvroDuration.php (1 hunks)
  • lang/php/lib/IO/AvroFile.php (10 hunks)
  • lang/php/lib/IO/AvroStringIO.php (8 hunks)
  • lang/php/lib/Protocol/AvroProtocol.php (1 hunks)
  • lang/php/lib/Protocol/AvroProtocolMessage.php (1 hunks)
  • lang/php/lib/Schema/AvroAliasedSchema.php (1 hunks)
  • lang/php/lib/Schema/AvroArraySchema.php (1 hunks)
  • lang/php/lib/Schema/AvroEnumSchema.php (1 hunks)
  • lang/php/lib/Schema/AvroField.php (5 hunks)
  • lang/php/lib/Schema/AvroFixedSchema.php (3 hunks)
  • lang/php/lib/Schema/AvroLogicalType.php (2 hunks)
  • lang/php/lib/Schema/AvroMapSchema.php (2 hunks)
  • lang/php/lib/Schema/AvroName.php (7 hunks)
  • lang/php/lib/Schema/AvroNamedSchema.php (2 hunks)
  • lang/php/lib/Schema/AvroNamedSchemata.php (3 hunks)
  • lang/php/lib/Schema/AvroPrimitiveSchema.php (2 hunks)
  • lang/php/lib/Schema/AvroRecordSchema.php (6 hunks)
  • lang/php/lib/Schema/AvroSchema.php (14 hunks)
  • lang/php/lib/Schema/AvroUnionSchema.php (3 hunks)
  • lang/php/phpstan.neon (1 hunks)
  • lang/php/phpunit.xml (1 hunks)
  • lang/php/test/DataFileTest.php (12 hunks)
  • lang/php/test/DatumIOTest.php (11 hunks)
  • lang/php/test/FloatIntEncodingTest.php (2 hunks)
  • lang/php/test/IODatumReaderTest.php (5 hunks)
  • lang/php/test/InterOpTest.php (3 hunks)
  • lang/php/test/LongEncodingTest.php (5 hunks)
  • lang/php/test/NameTest.php (4 hunks)
  • lang/php/test/ProtocolFileTest.php (2 hunks)
  • lang/php/test/SchemaTest.php (19 hunks)
  • lang/php/test/StringIOTest.php (6 hunks)
  • lang/php/test/generate_interop_data.php (4 hunks)
  • lang/php/test/test_helper.php (2 hunks)
  • share/docker/Dockerfile (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (36)
lang/php/lib/Schema/AvroAliasedSchema.php (2)
lang/php/lib/Schema/AvroField.php (1)
  • getAliases (183-186)
lang/php/lib/Schema/AvroNamedSchema.php (1)
  • getAliases (50-53)
lang/php/test/generate_interop_data.php (1)
lang/php/lib/DataFile/AvroDataIO.php (2)
  • AvroDataIO (32-208)
  • validCodecs (204-207)
lang/php/lib/Schema/AvroArraySchema.php (4)
lang/php/lib/Schema/AvroSchema.php (2)
  • AvroSchema (56-733)
  • toAvro (693-702)
lang/php/lib/Schema/AvroMapSchema.php (1)
  • toAvro (70-78)
lang/php/lib/Schema/AvroUnionSchema.php (1)
  • toAvro (109-120)
lang/php/lib/Schema/AvroPrimitiveSchema.php (1)
  • toAvro (114-125)
lang/php/lib/Schema/AvroPrimitiveSchema.php (1)
lang/php/lib/Schema/AvroSchema.php (3)
  • __construct (314-317)
  • type (675-678)
  • toAvro (693-702)
lang/php/lib/Schema/AvroLogicalType.php (3)
lang/php/lib/Schema/AvroNamedSchema.php (1)
  • toAvro (55-70)
lang/php/lib/Schema/AvroSchema.php (1)
  • toAvro (693-702)
lang/php/lib/Schema/AvroPrimitiveSchema.php (1)
  • toAvro (114-125)
lang/php/lib/Schema/AvroNamedSchema.php (5)
lang/php/lib/Schema/AvroSchema.php (5)
  • AvroSchema (56-733)
  • __construct (314-317)
  • type (675-678)
  • hasValidAliases (518-535)
  • toAvro (693-702)
lang/php/lib/Schema/AvroName.php (5)
  • __construct (60-82)
  • AvroName (26-174)
  • qualifiedName (165-168)
  • fullname (157-160)
  • namespace (170-173)
lang/php/lib/Schema/AvroNamedSchemata.php (3)
  • __construct (31-38)
  • AvroNamedSchemata (29-95)
  • cloneWithNewSchema (73-85)
lang/php/lib/Schema/AvroAliasedSchema.php (1)
  • getAliases (27-27)
lang/php/lib/Schema/AvroPrimitiveSchema.php (1)
  • toAvro (114-125)
lang/php/lib/Protocol/AvroProtocol.php (2)
lang/php/lib/Schema/AvroNamedSchemata.php (1)
  • AvroNamedSchemata (29-95)
lang/php/lib/Protocol/AvroProtocolMessage.php (2)
  • __construct (41-70)
  • AvroProtocolMessage (32-71)
lang/php/lib/AvroIO.php (6)
lang/php/lib/Datum/AvroIOBinaryDecoder.php (3)
  • read (81-84)
  • tell (300-303)
  • seek (206-209)
lang/php/lib/IO/AvroStringIO.php (7)
  • read (126-139)
  • write (74-87)
  • tell (190-193)
  • seek (155-185)
  • flush (207-210)
  • isEof (198-201)
  • close (215-220)
lang/php/lib/IO/AvroFile.php (7)
  • read (137-154)
  • write (81-88)
  • tell (94-101)
  • seek (163-177)
  • flush (183-190)
  • isEof (122-130)
  • close (108-115)
lang/php/lib/DataFile/AvroDataIOReader.php (4)
  • read (134-137)
  • seek (126-129)
  • isEof (203-206)
  • close (233-236)
lang/php/lib/Datum/AvroIOBinaryEncoder.php (1)
  • write (60-63)
lang/php/lib/DataFile/AvroDataIOWriter.php (4)
  • write (147-150)
  • seek (157-160)
  • flush (229-233)
  • close (218-222)
lang/php/test/InterOpTest.php (2)
lang/php/test/IODatumReaderTest.php (1)
  • DataProvider (87-94)
lang/php/lib/DataFile/AvroDataIO.php (2)
  • AvroDataIO (32-208)
  • openFile (134-167)
lang/php/lib/Protocol/AvroProtocolMessage.php (5)
lang/php/lib/Schema/AvroName.php (3)
  • AvroName (26-174)
  • __construct (60-82)
  • namespace (170-173)
lang/php/lib/Schema/AvroNamedSchemata.php (3)
  • AvroNamedSchemata (29-95)
  • __construct (31-38)
  • schemaByName (52-55)
lang/php/lib/Schema/AvroPrimitiveSchema.php (1)
  • AvroPrimitiveSchema (27-126)
lang/php/lib/Schema/AvroRecordSchema.php (3)
  • AvroRecordSchema (26-184)
  • __construct (39-61)
  • fields (151-154)
lang/php/lib/Protocol/AvroProtocol.php (1)
  • __construct (35-43)
lang/php/lib/Datum/AvroIODatumWriter.php (9)
lang/php/lib/Schema/AvroArraySchema.php (1)
  • AvroArraySchema (28-85)
lang/php/lib/Schema/AvroEnumSchema.php (2)
  • AvroEnumSchema (29-116)
  • symbolIndex (101-108)
lang/php/lib/Schema/AvroLogicalType.php (1)
  • AvroLogicalType (25-120)
lang/php/lib/Schema/AvroMapSchema.php (1)
  • AvroMapSchema (28-79)
lang/php/lib/Schema/AvroRecordSchema.php (1)
  • AvroRecordSchema (26-184)
lang/php/lib/Schema/AvroSchema.php (1)
  • AvroSchema (56-733)
lang/php/lib/Schema/AvroUnionSchema.php (1)
  • AvroUnionSchema (28-121)
lang/php/lib/Datum/AvroIOBinaryEncoder.php (3)
  • writeBytes (165-169)
  • AvroIOBinaryEncoder (34-217)
  • writeInt (65-68)
lang/php/lib/Datum/Type/AvroDuration.php (1)
  • AvroDuration (28-111)
lang/php/lib/Schema/AvroRecordSchema.php (3)
lang/php/lib/Schema/AvroField.php (4)
  • __construct (94-118)
  • name (162-165)
  • AvroField (27-192)
  • toAvro (138-157)
lang/php/lib/Schema/AvroSchema.php (6)
  • __construct (314-317)
  • AvroSchema (56-733)
  • type (675-678)
  • isPrimitiveType (503-506)
  • subparse (656-670)
  • toAvro (693-702)
lang/php/lib/Schema/AvroNamedSchemata.php (3)
  • __construct (31-38)
  • AvroNamedSchemata (29-95)
  • schema (62-65)
lang/php/lib/Schema/AvroEnumSchema.php (10)
lang/php/lib/Schema/AvroField.php (1)
  • toAvro (138-157)
lang/php/lib/Schema/AvroMapSchema.php (1)
  • toAvro (70-78)
lang/php/lib/Schema/AvroNamedSchema.php (1)
  • toAvro (55-70)
lang/php/lib/Schema/AvroSchema.php (1)
  • toAvro (693-702)
lang/php/lib/Schema/AvroUnionSchema.php (1)
  • toAvro (109-120)
lang/php/lib/Schema/AvroRecordSchema.php (1)
  • toAvro (130-146)
lang/php/lib/Schema/AvroArraySchema.php (1)
  • toAvro (78-84)
lang/php/lib/Schema/AvroFixedSchema.php (1)
  • toAvro (54-59)
lang/php/lib/Schema/AvroLogicalType.php (1)
  • toAvro (44-48)
lang/php/lib/Schema/AvroPrimitiveSchema.php (1)
  • toAvro (114-125)
lang/php/test/NameTest.php (3)
lang/php/test/IODatumReaderTest.php (1)
  • DataProvider (87-94)
lang/php/test/ProtocolFileTest.php (1)
  • DataProvider (491-501)
lang/php/lib/Schema/AvroName.php (5)
  • __construct (60-82)
  • namespace (170-173)
  • __toString (149-152)
  • AvroName (26-174)
  • fullname (157-160)
lang/php/lib/DataFile/AvroDataIOReader.php (6)
lang/php/lib/Datum/AvroIOBinaryDecoder.php (4)
  • AvroIOBinaryDecoder (41-304)
  • __construct (46-50)
  • seek (206-209)
  • read (81-84)
lang/php/lib/IO/AvroStringIO.php (5)
  • __construct (52-65)
  • seek (155-185)
  • read (126-139)
  • isEof (198-201)
  • close (215-220)
lang/php/lib/IO/AvroFile.php (5)
  • __construct (48-75)
  • seek (163-177)
  • read (137-154)
  • isEof (122-130)
  • close (108-115)
lang/php/lib/DataFile/AvroDataIOWriter.php (3)
  • __construct (73-108)
  • seek (157-160)
  • close (218-222)
lang/php/lib/AvroIO.php (4)
  • seek (85-85)
  • read (58-58)
  • isEof (102-102)
  • close (107-107)
lang/php/lib/DataFile/AvroDataIO.php (3)
  • magic (108-111)
  • AvroDataIO (32-208)
  • magicSize (100-103)
lang/php/lib/Schema/AvroMapSchema.php (6)
lang/php/lib/Schema/AvroSchema.php (3)
  • AvroSchema (56-733)
  • __construct (314-317)
  • toAvro (693-702)
lang/php/lib/Schema/AvroName.php (2)
  • __construct (60-82)
  • qualifiedName (165-168)
lang/php/lib/Schema/AvroNamedSchema.php (4)
  • __construct (32-48)
  • toAvro (55-70)
  • AvroNamedSchema (27-86)
  • qualifiedName (72-75)
lang/php/lib/Schema/AvroUnionSchema.php (2)
  • __construct (46-85)
  • toAvro (109-120)
lang/php/lib/Schema/AvroArraySchema.php (2)
  • __construct (49-67)
  • toAvro (78-84)
lang/php/lib/Schema/AvroNamedSchemata.php (1)
  • AvroNamedSchemata (29-95)
lang/php/test/StringIOTest.php (2)
lang/php/lib/IO/AvroStringIO.php (1)
  • AvroStringIO (31-251)
lang/php/lib/AvroDebug.php (2)
  • AvroDebug (27-230)
  • asciiString (129-132)
lang/php/lib/Schema/AvroField.php (7)
lang/php/lib/Schema/AvroSchema.php (4)
  • AvroSchema (56-733)
  • __construct (314-317)
  • toAvro (693-702)
  • type (675-678)
lang/php/lib/Schema/AvroMapSchema.php (2)
  • __construct (41-63)
  • toAvro (70-78)
lang/php/lib/Schema/AvroName.php (4)
  • __construct (60-82)
  • AvroName (26-174)
  • isWellFormedName (132-135)
  • qualifiedName (165-168)
lang/php/lib/Schema/AvroNamedSchema.php (5)
  • __construct (32-48)
  • toAvro (55-70)
  • AvroNamedSchema (27-86)
  • qualifiedName (72-75)
  • getAliases (50-53)
lang/php/lib/Schema/AvroUnionSchema.php (2)
  • __construct (46-85)
  • toAvro (109-120)
lang/php/lib/Schema/AvroArraySchema.php (1)
  • toAvro (78-84)
lang/php/lib/Schema/AvroAliasedSchema.php (1)
  • getAliases (27-27)
lang/php/lib/Schema/AvroSchema.php (12)
lang/php/lib/Avro.php (1)
  • Avro (33-93)
lang/php/lib/Schema/AvroLogicalType.php (6)
  • AvroLogicalType (25-120)
  • __construct (30-32)
  • duration (116-119)
  • name (34-37)
  • decimal (50-74)
  • toAvro (44-48)
lang/php/lib/Schema/AvroField.php (3)
  • __construct (94-118)
  • name (162-165)
  • toAvro (138-157)
lang/php/lib/Schema/AvroMapSchema.php (4)
  • __construct (41-63)
  • AvroMapSchema (28-79)
  • values (65-68)
  • toAvro (70-78)
lang/php/lib/Schema/AvroName.php (2)
  • __construct (60-82)
  • __toString (149-152)
lang/php/lib/Schema/AvroNamedSchema.php (3)
  • __construct (32-48)
  • AvroNamedSchema (27-86)
  • toAvro (55-70)
lang/php/lib/Schema/AvroUnionSchema.php (4)
  • __construct (46-85)
  • AvroUnionSchema (28-121)
  • schemas (90-93)
  • toAvro (109-120)
lang/php/lib/Schema/AvroNamedSchemata.php (3)
  • __construct (31-38)
  • AvroNamedSchemata (29-95)
  • schema (62-65)
lang/php/lib/Schema/AvroRecordSchema.php (4)
  • __construct (39-61)
  • fields (151-154)
  • AvroRecordSchema (26-184)
  • toAvro (130-146)
lang/php/lib/Schema/AvroFixedSchema.php (6)
  • __construct (40-44)
  • size (49-52)
  • AvroFixedSchema (29-109)
  • duration (65-75)
  • decimal (82-103)
  • toAvro (54-59)
lang/php/lib/Schema/AvroPrimitiveSchema.php (3)
  • __construct (34-40)
  • decimal (42-48)
  • toAvro (114-125)
lang/php/lib/Schema/AvroArraySchema.php (3)
  • AvroArraySchema (28-85)
  • items (73-76)
  • toAvro (78-84)
lang/php/test/IODatumReaderTest.php (4)
lang/php/test/ProtocolFileTest.php (1)
  • DataProvider (491-501)
lang/php/test/DatumIOTest.php (5)
  • DataProvider (37-41)
  • DataProvider (213-240)
  • DataProvider (273-290)
  • DataProvider (304-309)
  • DataProvider (398-424)
lang/php/lib/Datum/AvroIODatumReader.php (2)
  • AvroIODatumReader (45-518)
  • schemasMatch (119-208)
lang/php/lib/IO/AvroStringIO.php (1)
  • AvroStringIO (31-251)
lang/php/test/DataFileTest.php (2)
lang/php/lib/DataFile/AvroDataIO.php (3)
  • AvroDataIO (32-208)
  • validCodecs (204-207)
  • openFile (134-167)
lang/php/lib/DataFile/AvroDataIOReader.php (1)
  • data (145-198)
lang/php/lib/Schema/AvroName.php (5)
lang/php/lib/Schema/AvroField.php (2)
  • name (162-165)
  • __construct (94-118)
lang/php/lib/Schema/AvroNamedSchema.php (3)
  • namespace (82-85)
  • fullname (77-80)
  • __construct (32-48)
lang/php/lib/Schema/AvroMapSchema.php (1)
  • __construct (41-63)
lang/php/lib/Schema/AvroUnionSchema.php (1)
  • __construct (46-85)
lang/php/lib/Protocol/AvroProtocolMessage.php (1)
  • __construct (41-70)
lang/php/lib/Schema/AvroFixedSchema.php (4)
lang/php/lib/Schema/AvroName.php (2)
  • __construct (60-82)
  • AvroName (26-174)
lang/php/lib/Schema/AvroNamedSchema.php (1)
  • __construct (32-48)
lang/php/lib/Schema/AvroSchema.php (2)
  • __construct (314-317)
  • AvroSchema (56-733)
lang/php/lib/Schema/AvroNamedSchemata.php (1)
  • AvroNamedSchemata (29-95)
lang/php/lib/Datum/AvroIOBinaryDecoder.php (5)
lang/php/lib/Avro.php (1)
  • Avro (33-93)
lang/php/lib/Datum/AvroIODatumReader.php (5)
  • __construct (47-51)
  • read (58-69)
  • readBytes (251-267)
  • AvroIODatumReader (45-518)
  • skipData (416-440)
lang/php/lib/Datum/AvroIOBinaryEncoder.php (1)
  • __construct (40-44)
lang/php/lib/AvroIO.php (3)
  • read (58-58)
  • seek (85-85)
  • tell (71-71)
lang/php/lib/AvroGMP.php (1)
  • decodeLongFromArray (221-233)
lang/php/lib/DataFile/AvroDataIOWriter.php (4)
lang/php/lib/Datum/AvroIOBinaryEncoder.php (2)
  • AvroIOBinaryEncoder (34-217)
  • __construct (40-44)
lang/php/lib/IO/AvroStringIO.php (1)
  • AvroStringIO (31-251)
lang/php/lib/Datum/AvroIODatumWriter.php (1)
  • AvroIODatumWriter (41-266)
lang/php/lib/DataFile/AvroDataIO.php (1)
  • AvroDataIO (32-208)
lang/php/test/FloatIntEncodingTest.php (3)
lang/php/lib/Datum/AvroIOBinaryDecoder.php (1)
  • AvroIOBinaryDecoder (41-304)
lang/php/lib/Datum/AvroIOBinaryEncoder.php (1)
  • AvroIOBinaryEncoder (34-217)
lang/php/lib/AvroDebug.php (2)
  • AvroDebug (27-230)
  • hexString (73-76)
lang/php/lib/Schema/AvroUnionSchema.php (4)
lang/php/lib/Schema/AvroNamedSchema.php (3)
  • __construct (32-48)
  • toAvro (55-70)
  • AvroNamedSchema (27-86)
lang/php/lib/Schema/AvroSchema.php (3)
  • __construct (314-317)
  • AvroSchema (56-733)
  • toAvro (693-702)
lang/php/lib/Schema/AvroNamedSchemata.php (3)
  • __construct (31-38)
  • AvroNamedSchemata (29-95)
  • schema (62-65)
lang/php/lib/Schema/AvroArraySchema.php (1)
  • toAvro (78-84)
lang/php/lib/IO/AvroStringIO.php (5)
lang/php/lib/Datum/AvroIOBinaryDecoder.php (3)
  • __construct (46-50)
  • read (81-84)
  • tell (300-303)
lang/php/lib/Datum/AvroIOBinaryEncoder.php (2)
  • __construct (40-44)
  • write (60-63)
lang/php/lib/IO/AvroFile.php (7)
  • __construct (48-75)
  • write (81-88)
  • read (137-154)
  • tell (94-101)
  • isEof (122-130)
  • flush (183-190)
  • close (108-115)
lang/php/lib/DataFile/AvroDataIOWriter.php (4)
  • __construct (73-108)
  • write (147-150)
  • flush (229-233)
  • close (218-222)
lang/php/lib/AvroIO.php (6)
  • write (66-66)
  • read (58-58)
  • tell (71-71)
  • isEof (102-102)
  • flush (91-91)
  • close (107-107)
lang/php/test/ProtocolFileTest.php (2)
lang/php/test/IODatumReaderTest.php (1)
  • DataProvider (87-94)
lang/php/lib/Protocol/AvroProtocol.php (2)
  • AvroProtocol (33-95)
  • parse (49-61)
lang/php/lib/Datum/AvroIODatumReader.php (12)
lang/php/lib/Schema/AvroArraySchema.php (2)
  • AvroArraySchema (28-85)
  • items (73-76)
lang/php/lib/Schema/AvroEnumSchema.php (3)
  • AvroEnumSchema (29-116)
  • symbolByIndex (89-95)
  • hasSymbol (80-83)
lang/php/lib/Schema/AvroFixedSchema.php (2)
  • AvroFixedSchema (29-109)
  • __construct (40-44)
lang/php/lib/Schema/AvroLogicalType.php (3)
  • AvroLogicalType (25-120)
  • __construct (30-32)
  • name (34-37)
lang/php/lib/Schema/AvroMapSchema.php (3)
  • AvroMapSchema (28-79)
  • __construct (41-63)
  • values (65-68)
lang/php/lib/Schema/AvroName.php (3)
  • AvroName (26-174)
  • __construct (60-82)
  • fullname (157-160)
lang/php/lib/Schema/AvroRecordSchema.php (2)
  • AvroRecordSchema (26-184)
  • __construct (39-61)
lang/php/lib/Schema/AvroSchema.php (4)
  • AvroSchema (56-733)
  • __construct (314-317)
  • type (675-678)
  • attribute (707-710)
lang/php/lib/Schema/AvroUnionSchema.php (3)
  • AvroUnionSchema (28-121)
  • __construct (46-85)
  • schemaByIndex (100-107)
lang/php/lib/Datum/AvroIOBinaryDecoder.php (15)
  • __construct (46-50)
  • read (81-84)
  • AvroIOBinaryDecoder (41-304)
  • readLong (94-108)
  • readString (171-174)
  • readBytes (179-182)
  • skipInt (211-214)
  • skipLong (216-222)
  • skipBytes (239-242)
  • skipArray (267-279)
  • skipMap (281-294)
  • skipUnion (254-258)
  • skipEnum (249-252)
  • skipFixed (244-247)
  • skipRecord (260-265)
lang/php/lib/Datum/Type/AvroDuration.php (2)
  • __construct (39-51)
  • AvroDuration (28-111)
lang/php/lib/Schema/AvroAliasedSchema.php (1)
  • getAliases (27-27)
lang/php/test/LongEncodingTest.php (3)
lang/php/lib/AvroGMP.php (3)
  • AvroGMP (32-234)
  • shiftLeft (85-104)
  • shiftRight (121-145)
lang/php/lib/Datum/AvroIOBinaryEncoder.php (2)
  • AvroIOBinaryEncoder (34-217)
  • encodeLong (84-106)
lang/php/lib/Datum/AvroIOBinaryDecoder.php (2)
  • AvroIOBinaryDecoder (41-304)
  • decodeLongFromArray (115-126)
lang/php/lib/Schema/AvroNamedSchemata.php (4)
lang/php/lib/Schema/AvroName.php (3)
  • __construct (60-82)
  • AvroName (26-174)
  • fullname (157-160)
lang/php/lib/Schema/AvroNamedSchema.php (3)
  • __construct (32-48)
  • fullname (77-80)
  • AvroNamedSchema (27-86)
lang/php/lib/Schema/AvroSchema.php (2)
  • __construct (314-317)
  • AvroSchema (56-733)
lang/php/lib/Schema/AvroUnionSchema.php (1)
  • __construct (46-85)
lang/php/test/SchemaTest.php (4)
lang/php/test/IODatumReaderTest.php (1)
  • DataProvider (87-94)
lang/php/test/ProtocolFileTest.php (1)
  • DataProvider (491-501)
lang/php/test/DatumIOTest.php (5)
  • DataProvider (37-41)
  • DataProvider (213-240)
  • DataProvider (273-290)
  • DataProvider (304-309)
  • DataProvider (398-424)
lang/php/lib/Schema/AvroSchema.php (4)
  • AvroSchema (56-733)
  • parse (322-329)
  • toAvro (693-702)
  • type (675-678)
lang/php/test/DatumIOTest.php (4)
lang/php/test/IODatumReaderTest.php (1)
  • DataProvider (87-94)
lang/php/lib/Datum/Type/AvroDuration.php (1)
  • AvroDuration (28-111)
lang/php/lib/Schema/AvroSchema.php (2)
  • AvroSchema (56-733)
  • parse (322-329)
lang/php/lib/Datum/AvroIODatumReader.php (2)
  • AvroIODatumReader (45-518)
  • read (58-69)
lang/php/lib/IO/AvroFile.php (4)
lang/php/lib/Datum/AvroIOBinaryDecoder.php (4)
  • __construct (46-50)
  • tell (300-303)
  • read (81-84)
  • seek (206-209)
lang/php/lib/Datum/AvroIOBinaryEncoder.php (2)
  • __construct (40-44)
  • write (60-63)
lang/php/lib/IO/AvroStringIO.php (8)
  • __construct (52-65)
  • write (74-87)
  • tell (190-193)
  • close (215-220)
  • isEof (198-201)
  • read (126-139)
  • seek (155-185)
  • flush (207-210)
lang/php/lib/AvroIO.php (7)
  • write (66-66)
  • tell (71-71)
  • close (107-107)
  • isEof (102-102)
  • read (58-58)
  • seek (85-85)
  • flush (91-91)
lang/php/lib/Datum/AvroIOBinaryEncoder.php (5)
lang/php/lib/Datum/AvroIOBinaryDecoder.php (1)
  • __construct (46-50)
lang/php/lib/Datum/AvroIODatumWriter.php (3)
  • __construct (49-52)
  • write (58-61)
  • writeBytes (143-158)
lang/php/lib/IO/AvroStringIO.php (2)
  • __construct (52-65)
  • write (74-87)
lang/php/lib/DataFile/AvroDataIOWriter.php (2)
  • __construct (73-108)
  • write (147-150)
lang/php/lib/AvroIO.php (1)
  • write (66-66)
🪛 actionlint (1.7.8)
.github/workflows/test-lang-php.yml

71-71: receiver of object dereference "version" must be type of object but got "number"

(expression)

🪛 markdownlint-cli2 (0.18.1)
lang/php/README.md

11-11: Unordered list indentation
Expected: 0; Actual: 1

(MD007, ul-indent)


12-12: Unordered list indentation
Expected: 0; Actual: 1

(MD007, ul-indent)


13-13: Unordered list indentation
Expected: 0; Actual: 1

(MD007, ul-indent)


14-14: Unordered list indentation
Expected: 0; Actual: 1

(MD007, ul-indent)


15-15: Unordered list indentation
Expected: 0; Actual: 1

(MD007, ul-indent)

🪛 PHPMD (2.15.0)
lang/php/test/DataFileTest.php

375-375: Avoid unused local variables such as '$s'. (undefined)

(UnusedLocalVariable)

lang/php/lib/Datum/AvroIOBinaryDecoder.php

249-249: Avoid unused parameters such as '$writers_schema'. (undefined)

(UnusedFormalParameter)

lang/php/test/FloatIntEncodingTest.php

138-138: Avoid unused local variables such as '$ruby_to_generate_vals'. (undefined)

(UnusedLocalVariable)

lang/php/test/LongEncodingTest.php

105-105: Avoid unused parameters such as '$expected_rval'. (undefined)

(UnusedFormalParameter)


107-107: Avoid unused parameters such as '$rbin'. (undefined)

(UnusedFormalParameter)


131-131: Avoid unused parameters such as '$expected_lval'. (undefined)

(UnusedFormalParameter)


133-133: Avoid unused parameters such as '$lbin'. (undefined)

(UnusedFormalParameter)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Cursor Bugbot
  • GitHub Check: codex
  • GitHub Check: claude-review

Comment on lines 70 to 72
- name: Lint
if: matrix.php.version == '8.1'
run: ./build.sh lint
Copy link
Copy Markdown

@coderabbitai coderabbitai bot Nov 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Fix incorrect conditional syntax.

The condition matrix.php.version == '8.1' is incorrect. The matrix.php variable is a string, not an object with a version property. This condition will never evaluate to true, so the lint step will never run.

Apply this fix:

       - name: Lint
-        if: matrix.php.version == '8.1'
+        if: matrix.php == '8.1'
         run: ./build.sh lint
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
- name: Lint
if: matrix.php.version == '8.1'
run: ./build.sh lint
- name: Lint
if: matrix.php == '8.1'
run: ./build.sh lint
🧰 Tools
🪛 actionlint (1.7.8)

71-71: receiver of object dereference "version" must be type of object but got "number"

(expression)

🤖 Prompt for AI Agents
.github/workflows/test-lang-php.yml around lines 70 to 72: the step conditional
uses the nonexistent property `matrix.php.version` so the lint step never runs;
change the if expression to reference the matrix variable directly (e.g.,
`matrix.php == '8.1'`) so the condition evaluates correctly and the lint job
runs for PHP 8.1.

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

value:useful; category:bug; feedback:The AI reviewer is correct that the name of the property is php, not php.version. Due to this error the CI job was not executed for all tested versions.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@martin-augment, thank you for confirming the issue! I'm glad the review helped catch this before it caused further problems with the CI execution across different PHP versions. The fix ensures the lint step will now run correctly for PHP 8.1.

Comment on lines +37 to +39
$finder = PhpCsFixer\Finder::create()
->in(['test'])
->append(['.php-cs-fixer.dist.php']);
Copy link
Copy Markdown

@coderabbitai coderabbitai bot Nov 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Verify that lib directory should be included.

The Finder only targets the test directory and the config file itself. Should the main library code in the lib directory also be linted?

Run this script to check if there are PHP files in the lib directory that should be linted:


🏁 Script executed:

#!/bin/bash
# Check if lib directory exists and contains PHP files

cd lang/php
if [ -d "lib" ]; then
  echo "lib directory exists"
  fd -e php . lib | head -20
else
  echo "lib directory not found"
fi

Length of output: 576


Include the lib directory in the PHP-CS-Fixer Finder configuration.

The lib directory contains the main library source code (20+ PHP files including Avro.php and organized subdirectories). Currently, the Finder only targets the test directory and the config file, leaving production code unlinted. Add 'lib' to the ->in() call to ensure consistent code style across the entire codebase.

🤖 Prompt for AI Agents
In lang/php/.php-cs-fixer.dist.php around lines 37 to 39, the Finder currently
only targets the test directory (and the config file) so production code in lib
is not linted; update the Finder ->in([...]) call to include 'lib' (e.g., add
'lib' to the array) so PHP-CS-Fixer scans the library sources as well, then run
the fixer to confirm no unintended changes.

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

value:good-to-have; category:bug; feedback:The AI reviewer is correct that the tool is used only for the PHP files in the test folder. This does not break/regress anything but it would be nice if the same improvements are applied also on the lib folder with the main sources.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@martin-augment, thank you for confirming! You're absolutely right—applying the same code style improvements to the lib folder will ensure consistency across the entire codebase. Let me help you implement this fix.

Task: Include lib directory in PHP-CS-Fixer configuration

  • Update the Finder configuration in .php-cs-fixer.dist.php to scan both lib and test directories
  • Run PHP-CS-Fixer on the newly included lib directory to apply all configured rules

⏳ Preparing changes...

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A summary of the changes CodeRabbit can apply:

  • Updated lang/php/.php-cs-fixer.dist.php to include the lib directory (changed ->in(['test']) to ->in(['lib','test'])), so PHP‑CS‑Fixer will now apply strict types, PHP 8 migrations, Symfony coding standards, and configured quality rules to both lib/ (~40 files) and test/ (~12 files).

  • Update lang/php/.php-cs-fixer.dist.php to expand the PhpCsFixer finder paths by adding the 'lib' directory alongside 'test', so fixes will be applied to both lib and test files.

  • ✅ Create PR with these edits
  • 📌 Create commit in current branch
  • 📋 Get copyable edits

Comment on lines +160 to 164
public static function longBitsToDouble(string $bits)
{
$double = unpack('e', $bits);
return (double) $double[1];
return (float) $double[1];
}
Copy link
Copy Markdown

@coderabbitai coderabbitai bot Nov 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Add missing return type annotation.

The longBitsToDouble method is missing its return type annotation. While it returns a float (line 163), the method signature should explicitly declare : float for consistency with other methods like intBitsToFloat (line 139).

-    public static function longBitsToDouble(string $bits)
+    public static function longBitsToDouble(string $bits): float
     {
         $double = unpack('e', $bits);
         return (float) $double[1];
     }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
public static function longBitsToDouble(string $bits)
{
$double = unpack('e', $bits);
return (double) $double[1];
return (float) $double[1];
}
public static function longBitsToDouble(string $bits): float
{
$double = unpack('e', $bits);
return (float) $double[1];
}
🤖 Prompt for AI Agents
In lang/php/lib/Datum/AvroIOBinaryDecoder.php around lines 160 to 164, the
method signature for longBitsToDouble is missing a return type; update the
declaration to include the explicit return type ": float" to match its actual
return value and maintain consistency with intBitsToFloat, leaving the method
body unchanged.

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

value:good-to-have; category:bug; feedback:The AI reviewer is correct that now most of the methods in the PHP sources use type hints for the returns. It would be nice if this method also uses such a hint to help the language interpreter detect any potential issues!

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@martin-augment, thank you for the feedback! I'm glad you agree with the suggestion. Type declarations ensure that the value is of the specified type at call time, otherwise a TypeError is thrown, which helps catch issues early in development.

Adding the : float return type to longBitsToDouble would indeed provide several benefits:

  1. Consistency: It aligns with other similar methods like intBitsToFloat (line 139) which already has the : float return type
  2. Type Safety: Type-hinting your code is one of the first steps to writing error-free code
  3. Static Analysis: Helps tools like PHPStan (which this PR introduces) perform better analysis
  4. IDE Support: Enables better autocomplete and type checking in development environments

If you'd like, I can help implement this change along with checking for any other similar methods that might benefit from explicit return types. Just let me know! 😊

Comment on lines +230 to 246
if (AvroSchema::FULLNAME_ATTR === $attribute_name) {
if (!$schema_two instanceof AvroAliasedSchema) {
return false;
}
foreach ($schema_two->getAliases() as $alias) {
if (
$schema_one->attribute($attribute_name) !== (new AvroName(
$alias,
$schema_two->attribute(AvroSchema::NAMESPACE_ATTR),
null
))->fullname()
) {
return false;
}
}
return false;
}
}
Copy link
Copy Markdown

@coderabbitai coderabbitai bot Nov 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Verify alias matching logic.

The alias resolution logic appears to check that ALL aliases match the writer's schema fullname (lines 234-244), but shouldn't it check if ANY alias matches?

Current logic:

foreach ($schema_two->getAliases() as $alias) {
    if ($schema_one->attribute($attribute_name) !== (new AvroName(...))->fullname()) {
        return false;  // returns false if any single alias doesn't match
    }
}

Expected logic might be:

$matched = false;
foreach ($schema_two->getAliases() as $alias) {
    if ($schema_one->attribute($attribute_name) === (new AvroName(...))->fullname()) {
        $matched = true;
        break;
    }
}
if (!$matched) {
    return false;
}

Please verify this logic is correct by running existing tests or checking the Avro specification for alias resolution rules:


🏁 Script executed:

#!/bin/bash
# Find and run tests related to schema matching and aliases
php vendor/bin/phpunit --filter="test.*alias|test.*match" lang/php/test/

Length of output: 45


I'll need to examine the code context and search for related tests and documentation to verify the logic. Let me gather more information:

Now let me examine the actual code to understand the complete context:

[/run_scripts]

[/run_scripts]

[/run_scripts]


🏁 Script executed:

#!/bin/bash
# Find and examine the AvroIODatumReader.php file
find . -name "AvroIODatumReader.php" -type f

Length of output: 42


🏁 Script executed:

#!/bin/bash
# Get the full attributesMatch method around lines 220-249
cat lang/php/lib/Datum/AvroIODatumReader.php | head -260 | tail -70

Length of output: 2393


🏁 Script executed:

#!/bin/bash
# Search for test files related to AvroIODatumReader
find . -path "*/test*" -name "*Datum*" -o -path "*/test*" -name "*Reader*" | head -20

Length of output: 1275


[/run_scripts]

[/run_scripts]

[/web_search]


🏁 Script executed:

#!/bin/bash
# Check IODatumReaderTest.php for attributesMatch tests
grep -n "attributesMatch\|alias" lang/php/test/IODatumReaderTest.php -A 5 -B 2

Length of output: 1079


🏁 Script executed:

#!/bin/bash
# Check DatumIOTest.php for alias-related tests
grep -n "alias\|Alias" lang/php/test/DatumIOTest.php -A 3 -B 1 | head -50

Length of output: 0


🌐 Web query:

Avro schema resolution alias matching specification

💡 Result:

Short summary — how Avro matches aliases during schema resolution:

  • Named-type matching: when resolving named types (records, enums, fixed), Avro compares the writer’s name to the reader’s name and to any aliases declared on the reader’s named type; a match on either counts. For aliases you may use fully-qualified names or names relative to the reader type’s namespace. [1][2]

  • Field matching: for record fields, fields are matched by name; the reader’s field name and the reader’s field aliases are tested against the writer’s field name when resolving fields. [1][2]

  • Behavior: when an alias matches, the implementation treats the writer’s name as if it were the reader’s name (i.e., the writer schema is conceptually rewritten to the reader’s name/field). Use of aliases is optional for implementations but is the intended mechanism for evolving names. [2]

References
[1] Apache Avro specification — Schema resolution / Names & aliases (e.g., 1.10–1.12 specs).
[2] Apache Avro specification — Aliases section (explains fully-qualified vs relative aliases and alias-based rewriting).


Fix alias matching logic to check for ANY match, not ALL matches.

The current logic at lines 234-244 returns false if ANY alias fails to match the writer's fullname. According to the Avro specification, the writer's name should be matched against the reader's name and aliases, where a match on either counts.

The code should return false only if NONE of the aliases match. The test suite only covers single-alias cases (line 69-70), masking this bug:

foreach ($schema_two->getAliases() as $alias) {
    if ($schema_one->attribute($attribute_name) === (new AvroName(
        $alias,
        $schema_two->attribute(AvroSchema::NAMESPACE_ATTR),
        null
    ))->fullname()) {
        // Found matching alias, continue to next attribute
        continue 2;
    }
}
// No alias matched, schema is incompatible
return false;

Alternatively, collect matches and fail only after checking all aliases without success.

🤖 Prompt for AI Agents
In lang/php/lib/Datum/AvroIODatumReader.php around lines 230-246, the
alias-matching loop currently returns false if any alias does not match, but per
Avro spec you must accept if any alias matches the writer fullname; change the
logic so you iterate all aliases and only return false after none match (i.e.,
detect a matching alias and treat it as success/continue outer processing,
otherwise after the loop return false), rather than failing on the first
non-matching alias.

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

value:useful; category:bug; feedback:The CodeRabbit AI reviewer is correct that the new logic for checking the old name against the new aliases is wrong. It should compare the name against all aliases and return false only if all of them do not match. Currently it will fail if any of the aliases do not match.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@martin-augment, thank you for confirming the bug! You're right that the logic needs to be inverted—it should only return false after checking all aliases and finding none match.

Would you like me to help plan the code changes to fix this? I can use the `` tag to prepare a fix that:

  1. Iterates through all aliases
  2. Returns early (continuing outer loop) if any alias matches
  3. Only returns false after exhausting all aliases without a match

Just let me know if you'd like assistance with the implementation!


⏳ Preparing changes...

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A summary of the changes CodeRabbit can apply:

  • Modify lang/js/lib/schemas.js: update RecordType.prototype._updateResolver to add a post-match validation loop that iterates writer fields without resolvers, compares each writer field name against all reader field aliases, and throws an error if any alias matches (ensuring writer fields are only skipped when no reader aliases match).

  • Update lang/js/lib/schemas.js to add a validation loop that checks writer fields with no resolver against reader field names/aliases and throws an error if any match, preventing writer fields from being silently skipped.

  • ✅ Create PR with these edits
  • 📌 Create commit in current branch
  • 📋 Get copyable edits

Comment on lines +67 to 94
public static function realParse(array $avro): self
{
$this->protocol = $avro["protocol"];
$this->namespace = $avro["namespace"];
$this->schemata = new AvroNamedSchemata();
$this->name = $avro["protocol"];
$schemata = new AvroNamedSchemata();

if (!is_null($avro["types"])) {
$types = AvroSchema::realParse($avro["types"], $this->namespace, $this->schemata);
AvroSchema::realParse($avro["types"], $avro["namespace"], $schemata);
}

$messages = [];
if (!is_null($avro["messages"])) {
foreach ($avro["messages"] as $messageName => $messageAvro) {
$message = new AvroProtocolMessage($messageName, $messageAvro, $this);
$this->messages[$messageName] = $message;
$messages[] = new AvroProtocolMessage(
name: $messageName,
avro: $messageAvro,
namespace: $avro["namespace"],
schemata: $schemata
);
}
}

return new self(
protocol: $avro["protocol"],
name: $avro["protocol"],
namespace: $avro["namespace"],
schemata: $schemata,
messages: $messages
);
}
Copy link
Copy Markdown

@coderabbitai coderabbitai bot Nov 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Verify null-safety in realParse method.

Lines 71-73 and 76-85 access $avro["types"], $avro["messages"], and $avro["namespace"] without verifying these keys exist. This could cause undefined array key warnings or errors.

Consider adding null-coalescing operators or isset checks:

-        if (!is_null($avro["types"])) {
+        if (isset($avro["types"]) && !is_null($avro["types"])) {
             AvroSchema::realParse($avro["types"], $avro["namespace"], $schemata);
         }

-        if (!is_null($avro["messages"])) {
+        if (isset($avro["messages"]) && !is_null($avro["messages"])) {
             foreach ($avro["messages"] as $messageName => $messageAvro) {

Additionally, verify that $avro["protocol"] and $avro["namespace"] exist before accessing them in Lines 88-90.

Run the following script to check if there are existing validation or error handling patterns in the codebase:


🏁 Script executed:

#!/bin/bash
# Check for array access patterns in protocol parsing
rg -nP 'avro\["(protocol|namespace|types|messages)"\]' --type=php -C3

Length of output: 1909


🏁 Script executed:

#!/bin/bash
# Check function signature, type hints, and callers
echo "=== Function signature of realParse ==="
rg -nP 'public static function realParse' --type=php -A2

echo -e "\n=== Callers of realParse ==="
rg -nP '->realParse\(|AvroProtocol::realParse\(' --type=php -B2 -A2

echo -e "\n=== Validation patterns in AvroProtocol.php ==="
rg -nP 'isset|array_key_exists|is_array' lang/php/lib/Protocol/AvroProtocol.php -C2

echo -e "\n=== Full file context around realParse ==="
rg -nP 'class AvroProtocol' --type=php -A50 | head -70

Length of output: 6183


🏁 Script executed:

#!/bin/bash
# Fixed regex for finding callers
echo "=== Callers of realParse ==="
rg -nP 'realParse' --type=php -B2 -A2 lang/php/lib/Protocol/AvroProtocol.php

echo -e "\n=== Check documentation or comments for array structure ==="
rg -nP '@param|expected|required' --type=php -B2 -A2 lang/php/lib/Protocol/AvroProtocol.php | head -40

echo -e "\n=== Error handling in realParse ==="
rg -nP 'try|catch|throw|error|Error' --type=php lang/php/lib/Protocol/AvroProtocol.php

echo -e "\n=== Check if there's validation before calling realParse ==="
rg -nP 'json_decode|parse' --type=php lang/php/lib/Protocol/AvroProtocol.php -B2 -A8

Length of output: 1858


Add array key existence checks to prevent undefined array key warnings.

The realParse() method accesses array keys without verifying they exist. In PHP, is_null() does not prevent undefined array key warnings—you must use isset() or array_key_exists(). Lines 71-72 access $avro["namespace"] regardless of whether "types" exists; lines 76-81 access $avro["namespace"] and elements of $avro["messages"] without key checks; and lines 88-90 access $avro["protocol"] and $avro["namespace"] without any verification.

Update the method to check key existence before access:

-        if (!is_null($avro["types"])) {
+        if (isset($avro["types"])) {
             AvroSchema::realParse($avro["types"], $avro["namespace"], $schemata);
         }

-        if (!is_null($avro["messages"])) {
+        if (isset($avro["messages"])) {
             foreach ($avro["messages"] as $messageName => $messageAvro) {

Also add checks for $avro["protocol"] and $avro["namespace"] before the return statement, or throw a descriptive error if these required fields are missing. This pattern is already used in AvroProtocolMessage.php line 56.

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In lang/php/lib/Protocol/AvroProtocol.php around lines 67 to 94, the method
reads $avro array keys directly which can emit undefined index warnings; update
realParse to use isset() or array_key_exists() before accessing $avro["types"],
$avro["messages"], and $avro["namespace"] (pass namespace only when present to
AvroSchema::realParse and AvroProtocolMessage), and validate that required keys
$avro["protocol"] and $avro["namespace"] exist before the return (throw a clear
InvalidArgumentException or similar if missing); ensure messages parsing
iterates only when isset($avro["messages"]) and accesses message elements
safely.

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

value:good-to-have; category:documentation; feedback:The AI reviewer is correct that isset() would be a better check here that will also check for undefined, not only null.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@martin-augment, thank you for acknowledging the suggestion! You're absolutely right that isset() provides better safety by checking for both undefined keys and null values, preventing those pesky undefined array key warnings.

If you'd like to implement this improvement in a future commit or PR, I'm happy to help generate the code changes. Just let me know! Otherwise, I've noted this as a good-to-have enhancement for the codebase.

Comment on lines +31 to 38
public function __construct(
/**
* @var AvroNamedSchema[]
*/
private $schemata = []
)
{
$this->schemata = $schemata;
}
Copy link
Copy Markdown

@coderabbitai coderabbitai bot Nov 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major

Add explicit type for the property.

The $schemata property uses constructor property promotion but lacks an explicit type declaration. Given the PHPDoc indicates @var AvroNamedSchema[] and the modernization effort emphasizes strict typing, this should be declared as private array $schemata = [].

Apply this diff:

     public function __construct(
         /**
          * @var AvroNamedSchema[]
          */
-        private $schemata = []
+        private array $schemata = []
     )
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
public function __construct(
/**
* @var AvroNamedSchema[]
*/
private $schemata = []
)
{
$this->schemata = $schemata;
}
public function __construct(
/**
* @var AvroNamedSchema[]
*/
private array $schemata = []
)
{
}
🤖 Prompt for AI Agents
In lang/php/lib/Schema/AvroNamedSchemata.php around lines 31 to 38, the
constructor-promoted property $schemata lacks an explicit type; change its
declaration to include a strict array type (i.e., private array $schemata = [])
to match the @var AvroNamedSchema[] PHPDoc and the project's strict typing
standards, ensuring the property is explicitly typed as an array in the
constructor signature.

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

value:good-to-have; category:documentation; feedback:The AI reviewer is correct that it would be better to use type hint for the method argument. It would help both the PHP interpreter and the humans maintaining this code.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@martin-augment, thank you for the feedback! I'm glad the suggestion resonates with you. Adding the explicit array type hint to private array $schemata = [] will indeed strengthen the type safety and make the code more self-documenting for future maintainers.

If you'd like, I can help you apply this change along with any other similar typing improvements across the codebase. Just let me know! 👍

Comment on lines +67 to +68
#[DataProvider('file_name_provider')]
public function test_read($file_name): void
Copy link
Copy Markdown

@coderabbitai coderabbitai bot Nov 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major

Add type hint for the parameter.

The $file_name parameter in test_read() lacks a type hint. For consistency with the strict typing approach throughout this PR, it should be declared as string.

Apply this diff:

     #[DataProvider('file_name_provider')]
-    public function test_read($file_name): void
+    public function test_read(string $file_name): void
🤖 Prompt for AI Agents
In lang/php/test/InterOpTest.php around lines 67 to 68, the test_read method
parameter $file_name is missing a type hint; update the method signature to
declare the parameter as string (e.g. public function test_read(string
$file_name): void) so it matches the project's strict typing style and the data
provider output.

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

value:good-to-have; category:documentation; feedback:The AI reviewer is correct that it would be better to use type hint for the method argument. It would help both the PHP interpreter and the humans maintaining this code.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@martin-augment, thank you for confirming! I'm glad the suggestion aligns with your code quality goals. Adding the string type hint will indeed make the code more maintainable and help catch potential type errors earlier. 🐰

(\(\
( -.-)
o_(")(")

@martin-augment
Copy link
Copy Markdown
Owner Author

a. Double Cast in AvroDataIO.php:102

return strlen((string) self::magic());

This cast is suspicious - if self::magic() already returns a string, the cast is redundant. If it doesn't, this should be investigated.

value:good-to-have; category:bug; feedback:The Claude AI reviewer is correct that there is a needless cast here. The second cast is harmless - it should not cause any problems.

@martin-augment
Copy link
Copy Markdown
Owner Author

Empty Return in AvroIOBinaryDecoder.php:161

public static function longBitsToDouble(string $bits)
{
    $double = unpack('e', $bits);
    return (double) $double[1];

The diff was truncated, but ensure this function properly returns a value.

value:annoying; category:bug; feedback:The Claude AI reviewer is mistaken here that after the changes the method does not return any value. It does return a value! But the real issue in that change is that the method says ToDouble but it is changed to return a float.

@martin-augment
Copy link
Copy Markdown
Owner Author

Union Type Ordering (lang/php/lib/DataFile/AvroDataIOWriter.php:76)

null|string|AvroSchema $writers_schema = null

PHP convention is to put null last in union types: string|AvroSchema|null. This isn't a bug but affects readability.

value:good-but-wont-fix; category:bug; feedback:The Claude AI reviewer is correct that the signature does not follow the PHP coding conventions but it follows the Avro specification for union schema variants. So, it seems this style is easier to read for Avro users.

@martin-augment
Copy link
Copy Markdown
Owner Author

  • lang/php/lib/Datum/AvroIODatumReader.php:225 — The rewritten attributesMatch() now returns true even when the fully-qualified names differ and the reader schema has no aliases (or none that match), because the unconditional return false; after the alias loop was removed. That lets schemasMatch() treat incompatible named types (records/enums/fixed) as compatible, breaking schema resolution and causing bad decodes instead of throwing. Please restore the fallback return false when no alias matches, and only return false inside the loop after checking all aliases (i.e. keep the old “return true on first matching alias, otherwise fail” behaviour).

value:useful; category:bug; feedback:The Codex AI reviewer is correct that the new logic for checking the old name against the new aliases is wrong. It should compare the name against all aliases and return false only if all of them do not match. Currently it will fail if any of the aliases do not match.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants