3528: AVRO-4194: remove PHP EOL versions and use new language features#17
3528: AVRO-4194: remove PHP EOL versions and use new language features#17martin-augment wants to merge 16 commits intomainfrom
Conversation
WalkthroughThis PR modernizes the PHP Avro implementation by requiring PHP 8.1+, adding static analysis tooling (PHPStan, PHP-CS-Fixer, Rector), implementing strict typed properties and return types throughout the codebase, converting AvroIO from a class to an interface, and migrating tests to use PHP 8 attributes for data providers. Changes
✨ Finishing touches
🧪 Generate unit tests (beta)
Comment |
Pull Request Review: PHP 8.1+ Modernization (AVRO-4194)OverviewThis PR modernizes the PHP Avro library by dropping support for EOL PHP versions (7.3, 7.4, 8.0) and adopting PHP 8.1+ language features. Overall, this is a well-structured modernization effort with significant improvements to code quality and maintainability. ✅ Strengths1. Modern PHP Features Adoption
2. Architecture Improvements
3. Tooling & Quality Assurance
4. Test Coverage
|
| $aliases | ||
| name: $new_name, | ||
| schemata: $schemata, | ||
| aliases: $aliases |
There was a problem hiding this comment.
Bug: Silent doc loss in Avro duration schemas
The AvroFixedSchema::duration() method doesn't accept a doc parameter, but $doc is extracted from the schema at line 382. In the old code, $doc was passed to the duration factory method but was being ignored. While the new code correctly doesn't pass $doc, the extracted $doc value (line 382) is never used for duration schemas, which means documentation is silently discarded for duration logical types. The AvroFixedSchema constructor (called inside duration()) passes null for the doc parameter to parent AvroNamedSchema, discarding any doc attribute from the schema definition.
There was a problem hiding this comment.
value:useful; category:bug; feedback:The Bugbot AI agent is correct that the changes in the PR make the doc (documentation) property of Fixed schemas even more unused. Before it was passed to the constructor but not used. Now it is not even passed. Technically, there is no regression but it would have been better to make use of it instead of dropping it completely.
| } | ||
|
|
||
| $value = $decimal * (10 ** $scale); | ||
| $value = ((float) $decimal) * (10 ** $scale); |
There was a problem hiding this comment.
Casting $decimal to float before scaling can lose precision and lead to incorrect decimal encoding for larger magnitudes or scales; consider avoiding float conversion and using an integer/arbitrary-precision approach to compute the unscaled value (also applies to other locations in the PR if similar).
🤖 React with 👍 or 👎 to let us know if the comment was useful.
There was a problem hiding this comment.
value:useful; category:bug; feedback:The Augment AI reviewer is correct that the casting here looks wrong. It prevents precision loss. It would be better to use a special decimal type/library that preserves the precision.
|
There was a problem hiding this comment.
Actionable comments posted: 7
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (9)
lang/php/lib/Schema/AvroName.php (1)
157-168: Add explicit return types for consistency.Methods
fullname()andqualifiedName()lack explicit return type declarations while other methods in this class (e.g.,namespace()at line 170) now have them. For consistency with the modernization effort, these should declare: stringreturn types.Apply this diff to add explicit return types:
- public function fullname() + public function fullname(): string { return $this->fullname; } - public function qualifiedName() + public function qualifiedName(): string { return $this->qualified_name; }lang/php/lib/Schema/AvroMapSchema.php (1)
41-63: Verify null-safe access to $schemata parameter.The constructor calls
$schemata->schemaByName()on Line 49 without first checking if$schematais null. Since the parameter is nullable (?AvroNamedSchemata &$schemata = null), this will cause a null pointer error if$schematais not provided.Apply this diff to add a null check:
$this->isValuesSchemaFromSchemata = false; $values_schema = null; if ( is_string($values) + && $schemata !== null && $values_schema = $schemata->schemaByName( new AvroName($values, null, $defaultNamespace) )lang/php/lib/IO/AvroStringIO.php (2)
126-139: Restore interface-compatiblereadsignatureLine 126 drops the
inttype hint on$len, soAvroStringIO::read()no longer matchesAvroIO::read(int $len): string. In PHP 8.1 this triggers a fatal “must be compatible” error when the class is loaded. Please keep the integer parameter type.- public function read($len): string + public function read(int $len): string
169-172: Fix SEEK_CUR bounds checkLine 169 uses
$this->current_index + $whencewhen guarding against underflow, but$whenceis the mode constant (0/1/2), not the displacement. With a negative$offsetthe guard never fires, letting$current_indexfall below zero and leading to string-offset errors later. Compare against$offsetinstead.- if (0 > $this->current_index + $whence) { + if (0 > $this->current_index + $offset) {lang/php/lib/Schema/AvroUnionSchema.php (1)
50-83: Initialise$this->schemasbefore useLines 80-83 append to
$this->schemas, but the typed property is never initialised. In PHP 8.1 a statement like$this->schemas[] = …on an uninitialised typed property throws “must not be accessed before initialization”, so constructing unions now blows up. Seed the property with an empty array before the loop.- $this->schemaFromSchemataIndices = []; - $schema_types = []; + $this->schemaFromSchemataIndices = []; + $this->schemas = []; + $schema_types = [];lang/php/lib/Schema/AvroField.php (1)
80-117: Initialise$defaultto preserve old semantics
$defaultis only assigned when$hasDefaultis true. For fields without defaults, callingdefaultValue()(or other access) now throws “Typed property AvroField::$default must not be accessed before initialization”. Previously this quietly returnednull, so this is a new runtime break.- private mixed $default; + private mixed $default = null;lang/php/lib/DataFile/AvroDataIOWriter.php (1)
100-106: Default codec must be enforced when reading existing filesWhen we open an existing data file (no writer schema passed in), older Avro containers legitimately omit the
avro.codecmetadata key and rely on the implicit"null"codec. Accessing that key now returnsnull, and because$this->codecis a string-typed property, PHP 8.1+ throws aTypeError, breaking reads of every legacy file. We need to fall back toAvroDataIO::NULL_CODECbefore assigning and keep the metadata consistent.- $this->metadata[AvroDataIO::METADATA_CODEC_ATTR] = $this->codec - = $dfr->metadata[AvroDataIO::METADATA_CODEC_ATTR]; + $codecFromFile = $dfr->metadata[AvroDataIO::METADATA_CODEC_ATTR] ?? AvroDataIO::NULL_CODEC; + $this->metadata[AvroDataIO::METADATA_CODEC_ATTR] = $this->codec = $codecFromFile;lang/php/lib/DataFile/AvroDataIOReader.php (1)
75-83: Restore null-codec default to avoid new TypeError
$this->metadatacan legitimately lackavro.codec(older files) which used to mean “null” codec. After the refactor$codecbecomesnull, and assigning it to the string-typed$this->codecimmediately raises aTypeError, so every legacy container fails to open. Please restore the implicit default before validation.- $codec = $this->metadata[AvroDataIO::METADATA_CODEC_ATTR] ?? null; - if ($codec && !AvroDataIO::isValidCodec($codec)) { + $codec = $this->metadata[AvroDataIO::METADATA_CODEC_ATTR] ?? AvroDataIO::NULL_CODEC; + if (!AvroDataIO::isValidCodec($codec)) { throw new AvroDataIOException(sprintf('Unknown codec: %s', $codec)); } $this->codec = $codec;lang/php/lib/Schema/AvroRecordSchema.php (1)
94-109: Guard$schematabefore dereferencing
AvroRecordSchema::__constructstill defaults$schematato null, and external callers rely on that. The new code unconditionally calls$schemata->schemaByName(...), sonew AvroRecordSchema($name, $doc, $fields)now fatals with “Call to a member function schemaByName() on null.” Please reintroduce the null guard before dereferencing.- if ( - is_string($type) - && $field_schema = $schemata->schemaByName( + if ( + $schemata !== null + && is_string($type) + && ($field_schema = $schemata->schemaByName( new AvroName($type, null, $default_namespace) - ) - ) { + )) + ) {
🧹 Nitpick comments (10)
lang/php/lib/Datum/Type/AvroDuration.php (1)
30-50: Good immutability enforcement with readonly properties.The
readonlymodifier correctly prevents modification after construction, and explicitinttyping improves type safety.Since this PR aims to leverage new language features, consider using constructor property promotion (PHP 8.0+) for more concise, idiomatic code:
- private readonly int $months; - - private readonly int $days; - - private readonly int $milliseconds; - /** * @throws AvroException */ public function __construct( - int $months, - int $days, - int $milliseconds + private readonly int $months, + private readonly int $days, + private readonly int $milliseconds ) { self::validateComponent($months, 'months'); self::validateComponent($days, 'days'); self::validateComponent($milliseconds, 'milliseconds'); - - $this->months = $months; - $this->days = $days; - $this->milliseconds = $milliseconds; }With promoted properties, assignment happens on entry; if validation throws, the incomplete object is safely discarded.
lang/php/test/FloatIntEncodingTest.php (1)
138-145: Remove unused$ruby_to_generate_vals.
Line 138 assigns the heredoc to$ruby_to_generate_vals, but the variable is never read afterward. This leaves dead code in place and surfaces PHPMD’s UnusedLocalVariable warning. Please drop the assignment (or turn it into a regular comment) to keep the test tidy.- $ruby_to_generate_vals = <<<_RUBY - def d2lb(d); [d].pack('E') end - dary = (-10..10).to_a + [-1234.2132, -211e23] - dary.each {|x| b = d2lb(x); puts %/array(self::DOUBLE_TYPE, (double) #{x}, #{b.inspect}, '#{b.unpack('h*')[0]}'),/} - def f2ib(f); [f].pack('e') end - fary = (-10..10).to_a + [-1234.5, -211.3e6] - fary.each {|x| b = f2ib(x); puts %/array(self::FLOAT_TYPE, (float) #{x}, #{b.inspect}, '#{b.unpack('h*')[0]}'),/} - _RUBY;lang/php/test/generate_interop_data.php (1)
56-56: Consider standard comparison order for consistency.The comparison uses a Yoda condition (
AvroDataIO::NULL_CODEC == $codec). While this prevents accidental assignment in languages without strict comparison, PHP 8.1+ and modern static analysis tools make this defensive pattern less necessary.For consistency with modern PHP style guides, consider:
- $file_name = AvroDataIO::NULL_CODEC == $codec ? 'php.avro' : sprintf('php_%s.avro', $codec); + $file_name = $codec === AvroDataIO::NULL_CODEC ? 'php.avro' : sprintf('php_%s.avro', $codec);Note: This uses strict equality (
===) instead of loose equality (==) for better type safety.lang/php/lib/Schema/AvroPrimitiveSchema.php (1)
114-125: Excellent type safety improvement.The explicit
string|arrayreturn type and strict comparison operator (===) correctly align with PHP 8.1+ best practices. The added comment clarifies the intent, though it could be slightly more concise.Consider condensing the comment to one line for brevity:
- // This check has been added (I guess) to avoid printing something like this for primitive types: - // {"name": "something", "type": {"type": "string"}} + // Avoid nested type structure for primitives: return type string directly instead of {"type": "string"} if (1 === count($avro)) {lang/php/test/StringIOTest.php (1)
83-87: Minor redundant cast.The
(string)cast on Line 85 is unnecessary sinceAvroStringIO::string()already returns a string type. However, this is defensive and doesn't cause issues.Consider removing the redundant cast:
$this->assertEquals( 57, - strlen((string) $strio->string()), + strlen($strio->string()), AvroDebug::asciiString($strio->string()) );lang/php/phpstan.neon (1)
16-22: Consider incrementally increasing PHPStan level.Level 3 is a reasonable starting point for introducing static analysis. As the codebase stabilizes with PHP 8.1+ features, consider gradually increasing to level 4 or 5 to catch additional type-related issues and improve code quality.
lang/php/lib/DataFile/AvroDataIO.php (1)
100-103: Optional: Redundant type cast.The
(string)cast on line 102 is technically redundant sincepack()always returns a string. However, it may help static analyzers understand the type flow, so it's acceptable to keep for clarity.lang/php/test/DatumIOTest.php (1)
58-82: Consider adding parentheses for clarity.While the operator precedence is correct (unary minus binds tighter than bitshift in PHP), explicit parentheses would improve readability, especially for expressions like
-1 << 24which could be written as(-1) << 24.Example for a few lines:
- ['"long"', -1 << 24, "\xFF\xFF\xFF\x0F"], - ['"long"', -1 << 16, "\xFF\xFF\x07"], + ['"long"', (-1) << 24, "\xFF\xFF\xFF\x0F"], + ['"long"', (-1) << 16, "\xFF\xFF\x07"],lang/php/lib/Datum/AvroIOBinaryDecoder.php (2)
184-187: Remove unnecessary return statement.In a method with
voidreturn type, the explicitreturn;statement on line 186 is unnecessary.public function skipNull(): void { - return; }
249-252: Consider whether the$writers_schemaparameter is necessary.The static analysis tool flags that the
$writers_schemaparameter is unused inskipEnum. While this may be intentional for API consistency with other skip methods, consider whether it can be removed or if it's reserved for future use.If the parameter is truly unnecessary:
- public function skipEnum(AvroSchema $writers_schema, AvroIOBinaryDecoder $decoder): void + public function skipEnum(AvroIOBinaryDecoder $decoder): void { $decoder->skipInt(); }Note: Verify that all callers can be updated accordingly before making this change.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (52)
.github/workflows/test-lang-php.yml(1 hunks)composer.json(1 hunks)lang/php/.gitignore(1 hunks)lang/php/.php-cs-fixer.dist.php(1 hunks)lang/php/README.md(1 hunks)lang/php/build.sh(2 hunks)lang/php/examples/write_read.php(1 hunks)lang/php/lib/Avro.php(3 hunks)lang/php/lib/AvroDebug.php(4 hunks)lang/php/lib/AvroGMP.php(3 hunks)lang/php/lib/AvroIO.php(5 hunks)lang/php/lib/DataFile/AvroDataIO.php(1 hunks)lang/php/lib/DataFile/AvroDataIOReader.php(9 hunks)lang/php/lib/DataFile/AvroDataIOWriter.php(1 hunks)lang/php/lib/Datum/AvroIOBinaryDecoder.php(11 hunks)lang/php/lib/Datum/AvroIOBinaryEncoder.php(4 hunks)lang/php/lib/Datum/AvroIODatumReader.php(12 hunks)lang/php/lib/Datum/AvroIODatumWriter.php(7 hunks)lang/php/lib/Datum/Type/AvroDuration.php(1 hunks)lang/php/lib/IO/AvroFile.php(10 hunks)lang/php/lib/IO/AvroStringIO.php(8 hunks)lang/php/lib/Protocol/AvroProtocol.php(1 hunks)lang/php/lib/Protocol/AvroProtocolMessage.php(1 hunks)lang/php/lib/Schema/AvroAliasedSchema.php(1 hunks)lang/php/lib/Schema/AvroArraySchema.php(1 hunks)lang/php/lib/Schema/AvroEnumSchema.php(1 hunks)lang/php/lib/Schema/AvroField.php(5 hunks)lang/php/lib/Schema/AvroFixedSchema.php(3 hunks)lang/php/lib/Schema/AvroLogicalType.php(2 hunks)lang/php/lib/Schema/AvroMapSchema.php(2 hunks)lang/php/lib/Schema/AvroName.php(7 hunks)lang/php/lib/Schema/AvroNamedSchema.php(2 hunks)lang/php/lib/Schema/AvroNamedSchemata.php(3 hunks)lang/php/lib/Schema/AvroPrimitiveSchema.php(2 hunks)lang/php/lib/Schema/AvroRecordSchema.php(6 hunks)lang/php/lib/Schema/AvroSchema.php(14 hunks)lang/php/lib/Schema/AvroUnionSchema.php(3 hunks)lang/php/phpstan.neon(1 hunks)lang/php/phpunit.xml(1 hunks)lang/php/test/DataFileTest.php(12 hunks)lang/php/test/DatumIOTest.php(11 hunks)lang/php/test/FloatIntEncodingTest.php(2 hunks)lang/php/test/IODatumReaderTest.php(5 hunks)lang/php/test/InterOpTest.php(3 hunks)lang/php/test/LongEncodingTest.php(5 hunks)lang/php/test/NameTest.php(4 hunks)lang/php/test/ProtocolFileTest.php(2 hunks)lang/php/test/SchemaTest.php(19 hunks)lang/php/test/StringIOTest.php(6 hunks)lang/php/test/generate_interop_data.php(4 hunks)lang/php/test/test_helper.php(2 hunks)share/docker/Dockerfile(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (36)
lang/php/lib/Schema/AvroAliasedSchema.php (2)
lang/php/lib/Schema/AvroField.php (1)
getAliases(183-186)lang/php/lib/Schema/AvroNamedSchema.php (1)
getAliases(50-53)
lang/php/test/generate_interop_data.php (1)
lang/php/lib/DataFile/AvroDataIO.php (2)
AvroDataIO(32-208)validCodecs(204-207)
lang/php/lib/Schema/AvroArraySchema.php (4)
lang/php/lib/Schema/AvroSchema.php (2)
AvroSchema(56-733)toAvro(693-702)lang/php/lib/Schema/AvroMapSchema.php (1)
toAvro(70-78)lang/php/lib/Schema/AvroUnionSchema.php (1)
toAvro(109-120)lang/php/lib/Schema/AvroPrimitiveSchema.php (1)
toAvro(114-125)
lang/php/lib/Schema/AvroPrimitiveSchema.php (1)
lang/php/lib/Schema/AvroSchema.php (3)
__construct(314-317)type(675-678)toAvro(693-702)
lang/php/lib/Schema/AvroLogicalType.php (3)
lang/php/lib/Schema/AvroNamedSchema.php (1)
toAvro(55-70)lang/php/lib/Schema/AvroSchema.php (1)
toAvro(693-702)lang/php/lib/Schema/AvroPrimitiveSchema.php (1)
toAvro(114-125)
lang/php/lib/Schema/AvroNamedSchema.php (5)
lang/php/lib/Schema/AvroSchema.php (5)
AvroSchema(56-733)__construct(314-317)type(675-678)hasValidAliases(518-535)toAvro(693-702)lang/php/lib/Schema/AvroName.php (5)
__construct(60-82)AvroName(26-174)qualifiedName(165-168)fullname(157-160)namespace(170-173)lang/php/lib/Schema/AvroNamedSchemata.php (3)
__construct(31-38)AvroNamedSchemata(29-95)cloneWithNewSchema(73-85)lang/php/lib/Schema/AvroAliasedSchema.php (1)
getAliases(27-27)lang/php/lib/Schema/AvroPrimitiveSchema.php (1)
toAvro(114-125)
lang/php/lib/Protocol/AvroProtocol.php (2)
lang/php/lib/Schema/AvroNamedSchemata.php (1)
AvroNamedSchemata(29-95)lang/php/lib/Protocol/AvroProtocolMessage.php (2)
__construct(41-70)AvroProtocolMessage(32-71)
lang/php/lib/AvroIO.php (6)
lang/php/lib/Datum/AvroIOBinaryDecoder.php (3)
read(81-84)tell(300-303)seek(206-209)lang/php/lib/IO/AvroStringIO.php (7)
read(126-139)write(74-87)tell(190-193)seek(155-185)flush(207-210)isEof(198-201)close(215-220)lang/php/lib/IO/AvroFile.php (7)
read(137-154)write(81-88)tell(94-101)seek(163-177)flush(183-190)isEof(122-130)close(108-115)lang/php/lib/DataFile/AvroDataIOReader.php (4)
read(134-137)seek(126-129)isEof(203-206)close(233-236)lang/php/lib/Datum/AvroIOBinaryEncoder.php (1)
write(60-63)lang/php/lib/DataFile/AvroDataIOWriter.php (4)
write(147-150)seek(157-160)flush(229-233)close(218-222)
lang/php/test/InterOpTest.php (2)
lang/php/test/IODatumReaderTest.php (1)
DataProvider(87-94)lang/php/lib/DataFile/AvroDataIO.php (2)
AvroDataIO(32-208)openFile(134-167)
lang/php/lib/Protocol/AvroProtocolMessage.php (5)
lang/php/lib/Schema/AvroName.php (3)
AvroName(26-174)__construct(60-82)namespace(170-173)lang/php/lib/Schema/AvroNamedSchemata.php (3)
AvroNamedSchemata(29-95)__construct(31-38)schemaByName(52-55)lang/php/lib/Schema/AvroPrimitiveSchema.php (1)
AvroPrimitiveSchema(27-126)lang/php/lib/Schema/AvroRecordSchema.php (3)
AvroRecordSchema(26-184)__construct(39-61)fields(151-154)lang/php/lib/Protocol/AvroProtocol.php (1)
__construct(35-43)
lang/php/lib/Datum/AvroIODatumWriter.php (9)
lang/php/lib/Schema/AvroArraySchema.php (1)
AvroArraySchema(28-85)lang/php/lib/Schema/AvroEnumSchema.php (2)
AvroEnumSchema(29-116)symbolIndex(101-108)lang/php/lib/Schema/AvroLogicalType.php (1)
AvroLogicalType(25-120)lang/php/lib/Schema/AvroMapSchema.php (1)
AvroMapSchema(28-79)lang/php/lib/Schema/AvroRecordSchema.php (1)
AvroRecordSchema(26-184)lang/php/lib/Schema/AvroSchema.php (1)
AvroSchema(56-733)lang/php/lib/Schema/AvroUnionSchema.php (1)
AvroUnionSchema(28-121)lang/php/lib/Datum/AvroIOBinaryEncoder.php (3)
writeBytes(165-169)AvroIOBinaryEncoder(34-217)writeInt(65-68)lang/php/lib/Datum/Type/AvroDuration.php (1)
AvroDuration(28-111)
lang/php/lib/Schema/AvroRecordSchema.php (3)
lang/php/lib/Schema/AvroField.php (4)
__construct(94-118)name(162-165)AvroField(27-192)toAvro(138-157)lang/php/lib/Schema/AvroSchema.php (6)
__construct(314-317)AvroSchema(56-733)type(675-678)isPrimitiveType(503-506)subparse(656-670)toAvro(693-702)lang/php/lib/Schema/AvroNamedSchemata.php (3)
__construct(31-38)AvroNamedSchemata(29-95)schema(62-65)
lang/php/lib/Schema/AvroEnumSchema.php (10)
lang/php/lib/Schema/AvroField.php (1)
toAvro(138-157)lang/php/lib/Schema/AvroMapSchema.php (1)
toAvro(70-78)lang/php/lib/Schema/AvroNamedSchema.php (1)
toAvro(55-70)lang/php/lib/Schema/AvroSchema.php (1)
toAvro(693-702)lang/php/lib/Schema/AvroUnionSchema.php (1)
toAvro(109-120)lang/php/lib/Schema/AvroRecordSchema.php (1)
toAvro(130-146)lang/php/lib/Schema/AvroArraySchema.php (1)
toAvro(78-84)lang/php/lib/Schema/AvroFixedSchema.php (1)
toAvro(54-59)lang/php/lib/Schema/AvroLogicalType.php (1)
toAvro(44-48)lang/php/lib/Schema/AvroPrimitiveSchema.php (1)
toAvro(114-125)
lang/php/test/NameTest.php (3)
lang/php/test/IODatumReaderTest.php (1)
DataProvider(87-94)lang/php/test/ProtocolFileTest.php (1)
DataProvider(491-501)lang/php/lib/Schema/AvroName.php (5)
__construct(60-82)namespace(170-173)__toString(149-152)AvroName(26-174)fullname(157-160)
lang/php/lib/DataFile/AvroDataIOReader.php (6)
lang/php/lib/Datum/AvroIOBinaryDecoder.php (4)
AvroIOBinaryDecoder(41-304)__construct(46-50)seek(206-209)read(81-84)lang/php/lib/IO/AvroStringIO.php (5)
__construct(52-65)seek(155-185)read(126-139)isEof(198-201)close(215-220)lang/php/lib/IO/AvroFile.php (5)
__construct(48-75)seek(163-177)read(137-154)isEof(122-130)close(108-115)lang/php/lib/DataFile/AvroDataIOWriter.php (3)
__construct(73-108)seek(157-160)close(218-222)lang/php/lib/AvroIO.php (4)
seek(85-85)read(58-58)isEof(102-102)close(107-107)lang/php/lib/DataFile/AvroDataIO.php (3)
magic(108-111)AvroDataIO(32-208)magicSize(100-103)
lang/php/lib/Schema/AvroMapSchema.php (6)
lang/php/lib/Schema/AvroSchema.php (3)
AvroSchema(56-733)__construct(314-317)toAvro(693-702)lang/php/lib/Schema/AvroName.php (2)
__construct(60-82)qualifiedName(165-168)lang/php/lib/Schema/AvroNamedSchema.php (4)
__construct(32-48)toAvro(55-70)AvroNamedSchema(27-86)qualifiedName(72-75)lang/php/lib/Schema/AvroUnionSchema.php (2)
__construct(46-85)toAvro(109-120)lang/php/lib/Schema/AvroArraySchema.php (2)
__construct(49-67)toAvro(78-84)lang/php/lib/Schema/AvroNamedSchemata.php (1)
AvroNamedSchemata(29-95)
lang/php/test/StringIOTest.php (2)
lang/php/lib/IO/AvroStringIO.php (1)
AvroStringIO(31-251)lang/php/lib/AvroDebug.php (2)
AvroDebug(27-230)asciiString(129-132)
lang/php/lib/Schema/AvroField.php (7)
lang/php/lib/Schema/AvroSchema.php (4)
AvroSchema(56-733)__construct(314-317)toAvro(693-702)type(675-678)lang/php/lib/Schema/AvroMapSchema.php (2)
__construct(41-63)toAvro(70-78)lang/php/lib/Schema/AvroName.php (4)
__construct(60-82)AvroName(26-174)isWellFormedName(132-135)qualifiedName(165-168)lang/php/lib/Schema/AvroNamedSchema.php (5)
__construct(32-48)toAvro(55-70)AvroNamedSchema(27-86)qualifiedName(72-75)getAliases(50-53)lang/php/lib/Schema/AvroUnionSchema.php (2)
__construct(46-85)toAvro(109-120)lang/php/lib/Schema/AvroArraySchema.php (1)
toAvro(78-84)lang/php/lib/Schema/AvroAliasedSchema.php (1)
getAliases(27-27)
lang/php/lib/Schema/AvroSchema.php (12)
lang/php/lib/Avro.php (1)
Avro(33-93)lang/php/lib/Schema/AvroLogicalType.php (6)
AvroLogicalType(25-120)__construct(30-32)duration(116-119)name(34-37)decimal(50-74)toAvro(44-48)lang/php/lib/Schema/AvroField.php (3)
__construct(94-118)name(162-165)toAvro(138-157)lang/php/lib/Schema/AvroMapSchema.php (4)
__construct(41-63)AvroMapSchema(28-79)values(65-68)toAvro(70-78)lang/php/lib/Schema/AvroName.php (2)
__construct(60-82)__toString(149-152)lang/php/lib/Schema/AvroNamedSchema.php (3)
__construct(32-48)AvroNamedSchema(27-86)toAvro(55-70)lang/php/lib/Schema/AvroUnionSchema.php (4)
__construct(46-85)AvroUnionSchema(28-121)schemas(90-93)toAvro(109-120)lang/php/lib/Schema/AvroNamedSchemata.php (3)
__construct(31-38)AvroNamedSchemata(29-95)schema(62-65)lang/php/lib/Schema/AvroRecordSchema.php (4)
__construct(39-61)fields(151-154)AvroRecordSchema(26-184)toAvro(130-146)lang/php/lib/Schema/AvroFixedSchema.php (6)
__construct(40-44)size(49-52)AvroFixedSchema(29-109)duration(65-75)decimal(82-103)toAvro(54-59)lang/php/lib/Schema/AvroPrimitiveSchema.php (3)
__construct(34-40)decimal(42-48)toAvro(114-125)lang/php/lib/Schema/AvroArraySchema.php (3)
AvroArraySchema(28-85)items(73-76)toAvro(78-84)
lang/php/test/IODatumReaderTest.php (4)
lang/php/test/ProtocolFileTest.php (1)
DataProvider(491-501)lang/php/test/DatumIOTest.php (5)
DataProvider(37-41)DataProvider(213-240)DataProvider(273-290)DataProvider(304-309)DataProvider(398-424)lang/php/lib/Datum/AvroIODatumReader.php (2)
AvroIODatumReader(45-518)schemasMatch(119-208)lang/php/lib/IO/AvroStringIO.php (1)
AvroStringIO(31-251)
lang/php/test/DataFileTest.php (2)
lang/php/lib/DataFile/AvroDataIO.php (3)
AvroDataIO(32-208)validCodecs(204-207)openFile(134-167)lang/php/lib/DataFile/AvroDataIOReader.php (1)
data(145-198)
lang/php/lib/Schema/AvroName.php (5)
lang/php/lib/Schema/AvroField.php (2)
name(162-165)__construct(94-118)lang/php/lib/Schema/AvroNamedSchema.php (3)
namespace(82-85)fullname(77-80)__construct(32-48)lang/php/lib/Schema/AvroMapSchema.php (1)
__construct(41-63)lang/php/lib/Schema/AvroUnionSchema.php (1)
__construct(46-85)lang/php/lib/Protocol/AvroProtocolMessage.php (1)
__construct(41-70)
lang/php/lib/Schema/AvroFixedSchema.php (4)
lang/php/lib/Schema/AvroName.php (2)
__construct(60-82)AvroName(26-174)lang/php/lib/Schema/AvroNamedSchema.php (1)
__construct(32-48)lang/php/lib/Schema/AvroSchema.php (2)
__construct(314-317)AvroSchema(56-733)lang/php/lib/Schema/AvroNamedSchemata.php (1)
AvroNamedSchemata(29-95)
lang/php/lib/Datum/AvroIOBinaryDecoder.php (5)
lang/php/lib/Avro.php (1)
Avro(33-93)lang/php/lib/Datum/AvroIODatumReader.php (5)
__construct(47-51)read(58-69)readBytes(251-267)AvroIODatumReader(45-518)skipData(416-440)lang/php/lib/Datum/AvroIOBinaryEncoder.php (1)
__construct(40-44)lang/php/lib/AvroIO.php (3)
read(58-58)seek(85-85)tell(71-71)lang/php/lib/AvroGMP.php (1)
decodeLongFromArray(221-233)
lang/php/lib/DataFile/AvroDataIOWriter.php (4)
lang/php/lib/Datum/AvroIOBinaryEncoder.php (2)
AvroIOBinaryEncoder(34-217)__construct(40-44)lang/php/lib/IO/AvroStringIO.php (1)
AvroStringIO(31-251)lang/php/lib/Datum/AvroIODatumWriter.php (1)
AvroIODatumWriter(41-266)lang/php/lib/DataFile/AvroDataIO.php (1)
AvroDataIO(32-208)
lang/php/test/FloatIntEncodingTest.php (3)
lang/php/lib/Datum/AvroIOBinaryDecoder.php (1)
AvroIOBinaryDecoder(41-304)lang/php/lib/Datum/AvroIOBinaryEncoder.php (1)
AvroIOBinaryEncoder(34-217)lang/php/lib/AvroDebug.php (2)
AvroDebug(27-230)hexString(73-76)
lang/php/lib/Schema/AvroUnionSchema.php (4)
lang/php/lib/Schema/AvroNamedSchema.php (3)
__construct(32-48)toAvro(55-70)AvroNamedSchema(27-86)lang/php/lib/Schema/AvroSchema.php (3)
__construct(314-317)AvroSchema(56-733)toAvro(693-702)lang/php/lib/Schema/AvroNamedSchemata.php (3)
__construct(31-38)AvroNamedSchemata(29-95)schema(62-65)lang/php/lib/Schema/AvroArraySchema.php (1)
toAvro(78-84)
lang/php/lib/IO/AvroStringIO.php (5)
lang/php/lib/Datum/AvroIOBinaryDecoder.php (3)
__construct(46-50)read(81-84)tell(300-303)lang/php/lib/Datum/AvroIOBinaryEncoder.php (2)
__construct(40-44)write(60-63)lang/php/lib/IO/AvroFile.php (7)
__construct(48-75)write(81-88)read(137-154)tell(94-101)isEof(122-130)flush(183-190)close(108-115)lang/php/lib/DataFile/AvroDataIOWriter.php (4)
__construct(73-108)write(147-150)flush(229-233)close(218-222)lang/php/lib/AvroIO.php (6)
write(66-66)read(58-58)tell(71-71)isEof(102-102)flush(91-91)close(107-107)
lang/php/test/ProtocolFileTest.php (2)
lang/php/test/IODatumReaderTest.php (1)
DataProvider(87-94)lang/php/lib/Protocol/AvroProtocol.php (2)
AvroProtocol(33-95)parse(49-61)
lang/php/lib/Datum/AvroIODatumReader.php (12)
lang/php/lib/Schema/AvroArraySchema.php (2)
AvroArraySchema(28-85)items(73-76)lang/php/lib/Schema/AvroEnumSchema.php (3)
AvroEnumSchema(29-116)symbolByIndex(89-95)hasSymbol(80-83)lang/php/lib/Schema/AvroFixedSchema.php (2)
AvroFixedSchema(29-109)__construct(40-44)lang/php/lib/Schema/AvroLogicalType.php (3)
AvroLogicalType(25-120)__construct(30-32)name(34-37)lang/php/lib/Schema/AvroMapSchema.php (3)
AvroMapSchema(28-79)__construct(41-63)values(65-68)lang/php/lib/Schema/AvroName.php (3)
AvroName(26-174)__construct(60-82)fullname(157-160)lang/php/lib/Schema/AvroRecordSchema.php (2)
AvroRecordSchema(26-184)__construct(39-61)lang/php/lib/Schema/AvroSchema.php (4)
AvroSchema(56-733)__construct(314-317)type(675-678)attribute(707-710)lang/php/lib/Schema/AvroUnionSchema.php (3)
AvroUnionSchema(28-121)__construct(46-85)schemaByIndex(100-107)lang/php/lib/Datum/AvroIOBinaryDecoder.php (15)
__construct(46-50)read(81-84)AvroIOBinaryDecoder(41-304)readLong(94-108)readString(171-174)readBytes(179-182)skipInt(211-214)skipLong(216-222)skipBytes(239-242)skipArray(267-279)skipMap(281-294)skipUnion(254-258)skipEnum(249-252)skipFixed(244-247)skipRecord(260-265)lang/php/lib/Datum/Type/AvroDuration.php (2)
__construct(39-51)AvroDuration(28-111)lang/php/lib/Schema/AvroAliasedSchema.php (1)
getAliases(27-27)
lang/php/test/LongEncodingTest.php (3)
lang/php/lib/AvroGMP.php (3)
AvroGMP(32-234)shiftLeft(85-104)shiftRight(121-145)lang/php/lib/Datum/AvroIOBinaryEncoder.php (2)
AvroIOBinaryEncoder(34-217)encodeLong(84-106)lang/php/lib/Datum/AvroIOBinaryDecoder.php (2)
AvroIOBinaryDecoder(41-304)decodeLongFromArray(115-126)
lang/php/lib/Schema/AvroNamedSchemata.php (4)
lang/php/lib/Schema/AvroName.php (3)
__construct(60-82)AvroName(26-174)fullname(157-160)lang/php/lib/Schema/AvroNamedSchema.php (3)
__construct(32-48)fullname(77-80)AvroNamedSchema(27-86)lang/php/lib/Schema/AvroSchema.php (2)
__construct(314-317)AvroSchema(56-733)lang/php/lib/Schema/AvroUnionSchema.php (1)
__construct(46-85)
lang/php/test/SchemaTest.php (4)
lang/php/test/IODatumReaderTest.php (1)
DataProvider(87-94)lang/php/test/ProtocolFileTest.php (1)
DataProvider(491-501)lang/php/test/DatumIOTest.php (5)
DataProvider(37-41)DataProvider(213-240)DataProvider(273-290)DataProvider(304-309)DataProvider(398-424)lang/php/lib/Schema/AvroSchema.php (4)
AvroSchema(56-733)parse(322-329)toAvro(693-702)type(675-678)
lang/php/test/DatumIOTest.php (4)
lang/php/test/IODatumReaderTest.php (1)
DataProvider(87-94)lang/php/lib/Datum/Type/AvroDuration.php (1)
AvroDuration(28-111)lang/php/lib/Schema/AvroSchema.php (2)
AvroSchema(56-733)parse(322-329)lang/php/lib/Datum/AvroIODatumReader.php (2)
AvroIODatumReader(45-518)read(58-69)
lang/php/lib/IO/AvroFile.php (4)
lang/php/lib/Datum/AvroIOBinaryDecoder.php (4)
__construct(46-50)tell(300-303)read(81-84)seek(206-209)lang/php/lib/Datum/AvroIOBinaryEncoder.php (2)
__construct(40-44)write(60-63)lang/php/lib/IO/AvroStringIO.php (8)
__construct(52-65)write(74-87)tell(190-193)close(215-220)isEof(198-201)read(126-139)seek(155-185)flush(207-210)lang/php/lib/AvroIO.php (7)
write(66-66)tell(71-71)close(107-107)isEof(102-102)read(58-58)seek(85-85)flush(91-91)
lang/php/lib/Datum/AvroIOBinaryEncoder.php (5)
lang/php/lib/Datum/AvroIOBinaryDecoder.php (1)
__construct(46-50)lang/php/lib/Datum/AvroIODatumWriter.php (3)
__construct(49-52)write(58-61)writeBytes(143-158)lang/php/lib/IO/AvroStringIO.php (2)
__construct(52-65)write(74-87)lang/php/lib/DataFile/AvroDataIOWriter.php (2)
__construct(73-108)write(147-150)lang/php/lib/AvroIO.php (1)
write(66-66)
🪛 actionlint (1.7.8)
.github/workflows/test-lang-php.yml
71-71: receiver of object dereference "version" must be type of object but got "number"
(expression)
🪛 markdownlint-cli2 (0.18.1)
lang/php/README.md
11-11: Unordered list indentation
Expected: 0; Actual: 1
(MD007, ul-indent)
12-12: Unordered list indentation
Expected: 0; Actual: 1
(MD007, ul-indent)
13-13: Unordered list indentation
Expected: 0; Actual: 1
(MD007, ul-indent)
14-14: Unordered list indentation
Expected: 0; Actual: 1
(MD007, ul-indent)
15-15: Unordered list indentation
Expected: 0; Actual: 1
(MD007, ul-indent)
🪛 PHPMD (2.15.0)
lang/php/test/DataFileTest.php
375-375: Avoid unused local variables such as '$s'. (undefined)
(UnusedLocalVariable)
lang/php/lib/Datum/AvroIOBinaryDecoder.php
249-249: Avoid unused parameters such as '$writers_schema'. (undefined)
(UnusedFormalParameter)
lang/php/test/FloatIntEncodingTest.php
138-138: Avoid unused local variables such as '$ruby_to_generate_vals'. (undefined)
(UnusedLocalVariable)
lang/php/test/LongEncodingTest.php
105-105: Avoid unused parameters such as '$expected_rval'. (undefined)
(UnusedFormalParameter)
107-107: Avoid unused parameters such as '$rbin'. (undefined)
(UnusedFormalParameter)
131-131: Avoid unused parameters such as '$expected_lval'. (undefined)
(UnusedFormalParameter)
133-133: Avoid unused parameters such as '$lbin'. (undefined)
(UnusedFormalParameter)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Cursor Bugbot
- GitHub Check: codex
- GitHub Check: claude-review
| - name: Lint | ||
| if: matrix.php.version == '8.1' | ||
| run: ./build.sh lint |
There was a problem hiding this comment.
Fix incorrect conditional syntax.
The condition matrix.php.version == '8.1' is incorrect. The matrix.php variable is a string, not an object with a version property. This condition will never evaluate to true, so the lint step will never run.
Apply this fix:
- name: Lint
- if: matrix.php.version == '8.1'
+ if: matrix.php == '8.1'
run: ./build.sh lint📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| - name: Lint | |
| if: matrix.php.version == '8.1' | |
| run: ./build.sh lint | |
| - name: Lint | |
| if: matrix.php == '8.1' | |
| run: ./build.sh lint |
🧰 Tools
🪛 actionlint (1.7.8)
71-71: receiver of object dereference "version" must be type of object but got "number"
(expression)
🤖 Prompt for AI Agents
.github/workflows/test-lang-php.yml around lines 70 to 72: the step conditional
uses the nonexistent property `matrix.php.version` so the lint step never runs;
change the if expression to reference the matrix variable directly (e.g.,
`matrix.php == '8.1'`) so the condition evaluates correctly and the lint job
runs for PHP 8.1.
There was a problem hiding this comment.
value:useful; category:bug; feedback:The AI reviewer is correct that the name of the property is php, not php.version. Due to this error the CI job was not executed for all tested versions.
There was a problem hiding this comment.
@martin-augment, thank you for confirming the issue! I'm glad the review helped catch this before it caused further problems with the CI execution across different PHP versions. The fix ensures the lint step will now run correctly for PHP 8.1.
| $finder = PhpCsFixer\Finder::create() | ||
| ->in(['test']) | ||
| ->append(['.php-cs-fixer.dist.php']); |
There was a problem hiding this comment.
🧩 Analysis chain
Verify that lib directory should be included.
The Finder only targets the test directory and the config file itself. Should the main library code in the lib directory also be linted?
Run this script to check if there are PHP files in the lib directory that should be linted:
🏁 Script executed:
#!/bin/bash
# Check if lib directory exists and contains PHP files
cd lang/php
if [ -d "lib" ]; then
echo "lib directory exists"
fd -e php . lib | head -20
else
echo "lib directory not found"
fiLength of output: 576
Include the lib directory in the PHP-CS-Fixer Finder configuration.
The lib directory contains the main library source code (20+ PHP files including Avro.php and organized subdirectories). Currently, the Finder only targets the test directory and the config file, leaving production code unlinted. Add 'lib' to the ->in() call to ensure consistent code style across the entire codebase.
🤖 Prompt for AI Agents
In lang/php/.php-cs-fixer.dist.php around lines 37 to 39, the Finder currently
only targets the test directory (and the config file) so production code in lib
is not linted; update the Finder ->in([...]) call to include 'lib' (e.g., add
'lib' to the array) so PHP-CS-Fixer scans the library sources as well, then run
the fixer to confirm no unintended changes.
There was a problem hiding this comment.
value:good-to-have; category:bug; feedback:The AI reviewer is correct that the tool is used only for the PHP files in the test folder. This does not break/regress anything but it would be nice if the same improvements are applied also on the lib folder with the main sources.
There was a problem hiding this comment.
@martin-augment, thank you for confirming! You're absolutely right—applying the same code style improvements to the lib folder will ensure consistency across the entire codebase. Let me help you implement this fix.
Task: Include lib directory in PHP-CS-Fixer configuration
- Update the Finder configuration in
.php-cs-fixer.dist.phpto scan bothlibandtestdirectories - Run PHP-CS-Fixer on the newly included
libdirectory to apply all configured rules
⏳ Preparing changes...
There was a problem hiding this comment.
A summary of the changes CodeRabbit can apply:
Updated lang/php/.php-cs-fixer.dist.php to include the lib directory (changed ->in(['test']) to ->in(['lib','test'])), so PHP‑CS‑Fixer will now apply strict types, PHP 8 migrations, Symfony coding standards, and configured quality rules to both lib/ (~40 files) and test/ (~12 files).
Update lang/php/.php-cs-fixer.dist.php to expand the PhpCsFixer finder paths by adding the 'lib' directory alongside 'test', so fixes will be applied to both lib and test files.
- ✅ Create PR with these edits
- 📌 Create commit in current branch
- 📋 Get copyable edits
| public static function longBitsToDouble(string $bits) | ||
| { | ||
| $double = unpack('e', $bits); | ||
| return (double) $double[1]; | ||
| return (float) $double[1]; | ||
| } |
There was a problem hiding this comment.
Add missing return type annotation.
The longBitsToDouble method is missing its return type annotation. While it returns a float (line 163), the method signature should explicitly declare : float for consistency with other methods like intBitsToFloat (line 139).
- public static function longBitsToDouble(string $bits)
+ public static function longBitsToDouble(string $bits): float
{
$double = unpack('e', $bits);
return (float) $double[1];
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| public static function longBitsToDouble(string $bits) | |
| { | |
| $double = unpack('e', $bits); | |
| return (double) $double[1]; | |
| return (float) $double[1]; | |
| } | |
| public static function longBitsToDouble(string $bits): float | |
| { | |
| $double = unpack('e', $bits); | |
| return (float) $double[1]; | |
| } |
🤖 Prompt for AI Agents
In lang/php/lib/Datum/AvroIOBinaryDecoder.php around lines 160 to 164, the
method signature for longBitsToDouble is missing a return type; update the
declaration to include the explicit return type ": float" to match its actual
return value and maintain consistency with intBitsToFloat, leaving the method
body unchanged.
There was a problem hiding this comment.
value:good-to-have; category:bug; feedback:The AI reviewer is correct that now most of the methods in the PHP sources use type hints for the returns. It would be nice if this method also uses such a hint to help the language interpreter detect any potential issues!
There was a problem hiding this comment.
@martin-augment, thank you for the feedback! I'm glad you agree with the suggestion. Type declarations ensure that the value is of the specified type at call time, otherwise a TypeError is thrown, which helps catch issues early in development.
Adding the : float return type to longBitsToDouble would indeed provide several benefits:
- Consistency: It aligns with other similar methods like
intBitsToFloat(line 139) which already has the: floatreturn type - Type Safety: Type-hinting your code is one of the first steps to writing error-free code
- Static Analysis: Helps tools like PHPStan (which this PR introduces) perform better analysis
- IDE Support: Enables better autocomplete and type checking in development environments
If you'd like, I can help implement this change along with checking for any other similar methods that might benefit from explicit return types. Just let me know! 😊
| if (AvroSchema::FULLNAME_ATTR === $attribute_name) { | ||
| if (!$schema_two instanceof AvroAliasedSchema) { | ||
| return false; | ||
| } | ||
| foreach ($schema_two->getAliases() as $alias) { | ||
| if ( | ||
| $schema_one->attribute($attribute_name) !== (new AvroName( | ||
| $alias, | ||
| $schema_two->attribute(AvroSchema::NAMESPACE_ATTR), | ||
| null | ||
| ))->fullname() | ||
| ) { | ||
| return false; | ||
| } | ||
| } | ||
| return false; | ||
| } | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
Verify alias matching logic.
The alias resolution logic appears to check that ALL aliases match the writer's schema fullname (lines 234-244), but shouldn't it check if ANY alias matches?
Current logic:
foreach ($schema_two->getAliases() as $alias) {
if ($schema_one->attribute($attribute_name) !== (new AvroName(...))->fullname()) {
return false; // returns false if any single alias doesn't match
}
}Expected logic might be:
$matched = false;
foreach ($schema_two->getAliases() as $alias) {
if ($schema_one->attribute($attribute_name) === (new AvroName(...))->fullname()) {
$matched = true;
break;
}
}
if (!$matched) {
return false;
}Please verify this logic is correct by running existing tests or checking the Avro specification for alias resolution rules:
🏁 Script executed:
#!/bin/bash
# Find and run tests related to schema matching and aliases
php vendor/bin/phpunit --filter="test.*alias|test.*match" lang/php/test/Length of output: 45
I'll need to examine the code context and search for related tests and documentation to verify the logic. Let me gather more information:
Now let me examine the actual code to understand the complete context:
[/run_scripts]
[/run_scripts]
[/run_scripts]
🏁 Script executed:
#!/bin/bash
# Find and examine the AvroIODatumReader.php file
find . -name "AvroIODatumReader.php" -type fLength of output: 42
🏁 Script executed:
#!/bin/bash
# Get the full attributesMatch method around lines 220-249
cat lang/php/lib/Datum/AvroIODatumReader.php | head -260 | tail -70Length of output: 2393
🏁 Script executed:
#!/bin/bash
# Search for test files related to AvroIODatumReader
find . -path "*/test*" -name "*Datum*" -o -path "*/test*" -name "*Reader*" | head -20Length of output: 1275
[/run_scripts]
[/run_scripts]
[/web_search]
🏁 Script executed:
#!/bin/bash
# Check IODatumReaderTest.php for attributesMatch tests
grep -n "attributesMatch\|alias" lang/php/test/IODatumReaderTest.php -A 5 -B 2Length of output: 1079
🏁 Script executed:
#!/bin/bash
# Check DatumIOTest.php for alias-related tests
grep -n "alias\|Alias" lang/php/test/DatumIOTest.php -A 3 -B 1 | head -50Length of output: 0
🌐 Web query:
Avro schema resolution alias matching specification
💡 Result:
Short summary — how Avro matches aliases during schema resolution:
-
Named-type matching: when resolving named types (records, enums, fixed), Avro compares the writer’s name to the reader’s name and to any aliases declared on the reader’s named type; a match on either counts. For aliases you may use fully-qualified names or names relative to the reader type’s namespace. [1][2]
-
Field matching: for record fields, fields are matched by name; the reader’s field name and the reader’s field aliases are tested against the writer’s field name when resolving fields. [1][2]
-
Behavior: when an alias matches, the implementation treats the writer’s name as if it were the reader’s name (i.e., the writer schema is conceptually rewritten to the reader’s name/field). Use of aliases is optional for implementations but is the intended mechanism for evolving names. [2]
References
[1] Apache Avro specification — Schema resolution / Names & aliases (e.g., 1.10–1.12 specs).
[2] Apache Avro specification — Aliases section (explains fully-qualified vs relative aliases and alias-based rewriting).
Fix alias matching logic to check for ANY match, not ALL matches.
The current logic at lines 234-244 returns false if ANY alias fails to match the writer's fullname. According to the Avro specification, the writer's name should be matched against the reader's name and aliases, where a match on either counts.
The code should return false only if NONE of the aliases match. The test suite only covers single-alias cases (line 69-70), masking this bug:
foreach ($schema_two->getAliases() as $alias) {
if ($schema_one->attribute($attribute_name) === (new AvroName(
$alias,
$schema_two->attribute(AvroSchema::NAMESPACE_ATTR),
null
))->fullname()) {
// Found matching alias, continue to next attribute
continue 2;
}
}
// No alias matched, schema is incompatible
return false;Alternatively, collect matches and fail only after checking all aliases without success.
🤖 Prompt for AI Agents
In lang/php/lib/Datum/AvroIODatumReader.php around lines 230-246, the
alias-matching loop currently returns false if any alias does not match, but per
Avro spec you must accept if any alias matches the writer fullname; change the
logic so you iterate all aliases and only return false after none match (i.e.,
detect a matching alias and treat it as success/continue outer processing,
otherwise after the loop return false), rather than failing on the first
non-matching alias.
There was a problem hiding this comment.
value:useful; category:bug; feedback:The CodeRabbit AI reviewer is correct that the new logic for checking the old name against the new aliases is wrong. It should compare the name against all aliases and return false only if all of them do not match. Currently it will fail if any of the aliases do not match.
There was a problem hiding this comment.
@martin-augment, thank you for confirming the bug! You're right that the logic needs to be inverted—it should only return false after checking all aliases and finding none match.
Would you like me to help plan the code changes to fix this? I can use the `` tag to prepare a fix that:
- Iterates through all aliases
- Returns early (continuing outer loop) if any alias matches
- Only returns
falseafter exhausting all aliases without a match
Just let me know if you'd like assistance with the implementation!
⏳ Preparing changes...
There was a problem hiding this comment.
A summary of the changes CodeRabbit can apply:
Modify lang/js/lib/schemas.js: update RecordType.prototype._updateResolver to add a post-match validation loop that iterates writer fields without resolvers, compares each writer field name against all reader field aliases, and throws an error if any alias matches (ensuring writer fields are only skipped when no reader aliases match).
Update lang/js/lib/schemas.js to add a validation loop that checks writer fields with no resolver against reader field names/aliases and throws an error if any match, preventing writer fields from being silently skipped.
- ✅ Create PR with these edits
- 📌 Create commit in current branch
- 📋 Get copyable edits
| public static function realParse(array $avro): self | ||
| { | ||
| $this->protocol = $avro["protocol"]; | ||
| $this->namespace = $avro["namespace"]; | ||
| $this->schemata = new AvroNamedSchemata(); | ||
| $this->name = $avro["protocol"]; | ||
| $schemata = new AvroNamedSchemata(); | ||
|
|
||
| if (!is_null($avro["types"])) { | ||
| $types = AvroSchema::realParse($avro["types"], $this->namespace, $this->schemata); | ||
| AvroSchema::realParse($avro["types"], $avro["namespace"], $schemata); | ||
| } | ||
|
|
||
| $messages = []; | ||
| if (!is_null($avro["messages"])) { | ||
| foreach ($avro["messages"] as $messageName => $messageAvro) { | ||
| $message = new AvroProtocolMessage($messageName, $messageAvro, $this); | ||
| $this->messages[$messageName] = $message; | ||
| $messages[] = new AvroProtocolMessage( | ||
| name: $messageName, | ||
| avro: $messageAvro, | ||
| namespace: $avro["namespace"], | ||
| schemata: $schemata | ||
| ); | ||
| } | ||
| } | ||
|
|
||
| return new self( | ||
| protocol: $avro["protocol"], | ||
| name: $avro["protocol"], | ||
| namespace: $avro["namespace"], | ||
| schemata: $schemata, | ||
| messages: $messages | ||
| ); | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
Verify null-safety in realParse method.
Lines 71-73 and 76-85 access $avro["types"], $avro["messages"], and $avro["namespace"] without verifying these keys exist. This could cause undefined array key warnings or errors.
Consider adding null-coalescing operators or isset checks:
- if (!is_null($avro["types"])) {
+ if (isset($avro["types"]) && !is_null($avro["types"])) {
AvroSchema::realParse($avro["types"], $avro["namespace"], $schemata);
}
- if (!is_null($avro["messages"])) {
+ if (isset($avro["messages"]) && !is_null($avro["messages"])) {
foreach ($avro["messages"] as $messageName => $messageAvro) {Additionally, verify that $avro["protocol"] and $avro["namespace"] exist before accessing them in Lines 88-90.
Run the following script to check if there are existing validation or error handling patterns in the codebase:
🏁 Script executed:
#!/bin/bash
# Check for array access patterns in protocol parsing
rg -nP 'avro\["(protocol|namespace|types|messages)"\]' --type=php -C3Length of output: 1909
🏁 Script executed:
#!/bin/bash
# Check function signature, type hints, and callers
echo "=== Function signature of realParse ==="
rg -nP 'public static function realParse' --type=php -A2
echo -e "\n=== Callers of realParse ==="
rg -nP '->realParse\(|AvroProtocol::realParse\(' --type=php -B2 -A2
echo -e "\n=== Validation patterns in AvroProtocol.php ==="
rg -nP 'isset|array_key_exists|is_array' lang/php/lib/Protocol/AvroProtocol.php -C2
echo -e "\n=== Full file context around realParse ==="
rg -nP 'class AvroProtocol' --type=php -A50 | head -70Length of output: 6183
🏁 Script executed:
#!/bin/bash
# Fixed regex for finding callers
echo "=== Callers of realParse ==="
rg -nP 'realParse' --type=php -B2 -A2 lang/php/lib/Protocol/AvroProtocol.php
echo -e "\n=== Check documentation or comments for array structure ==="
rg -nP '@param|expected|required' --type=php -B2 -A2 lang/php/lib/Protocol/AvroProtocol.php | head -40
echo -e "\n=== Error handling in realParse ==="
rg -nP 'try|catch|throw|error|Error' --type=php lang/php/lib/Protocol/AvroProtocol.php
echo -e "\n=== Check if there's validation before calling realParse ==="
rg -nP 'json_decode|parse' --type=php lang/php/lib/Protocol/AvroProtocol.php -B2 -A8Length of output: 1858
Add array key existence checks to prevent undefined array key warnings.
The realParse() method accesses array keys without verifying they exist. In PHP, is_null() does not prevent undefined array key warnings—you must use isset() or array_key_exists(). Lines 71-72 access $avro["namespace"] regardless of whether "types" exists; lines 76-81 access $avro["namespace"] and elements of $avro["messages"] without key checks; and lines 88-90 access $avro["protocol"] and $avro["namespace"] without any verification.
Update the method to check key existence before access:
- if (!is_null($avro["types"])) {
+ if (isset($avro["types"])) {
AvroSchema::realParse($avro["types"], $avro["namespace"], $schemata);
}
- if (!is_null($avro["messages"])) {
+ if (isset($avro["messages"])) {
foreach ($avro["messages"] as $messageName => $messageAvro) {Also add checks for $avro["protocol"] and $avro["namespace"] before the return statement, or throw a descriptive error if these required fields are missing. This pattern is already used in AvroProtocolMessage.php line 56.
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In lang/php/lib/Protocol/AvroProtocol.php around lines 67 to 94, the method
reads $avro array keys directly which can emit undefined index warnings; update
realParse to use isset() or array_key_exists() before accessing $avro["types"],
$avro["messages"], and $avro["namespace"] (pass namespace only when present to
AvroSchema::realParse and AvroProtocolMessage), and validate that required keys
$avro["protocol"] and $avro["namespace"] exist before the return (throw a clear
InvalidArgumentException or similar if missing); ensure messages parsing
iterates only when isset($avro["messages"]) and accesses message elements
safely.
There was a problem hiding this comment.
value:good-to-have; category:documentation; feedback:The AI reviewer is correct that isset() would be a better check here that will also check for undefined, not only null.
There was a problem hiding this comment.
@martin-augment, thank you for acknowledging the suggestion! You're absolutely right that isset() provides better safety by checking for both undefined keys and null values, preventing those pesky undefined array key warnings.
If you'd like to implement this improvement in a future commit or PR, I'm happy to help generate the code changes. Just let me know! Otherwise, I've noted this as a good-to-have enhancement for the codebase.
| public function __construct( | ||
| /** | ||
| * @var AvroNamedSchema[] | ||
| */ | ||
| private $schemata = [] | ||
| ) | ||
| { | ||
| $this->schemata = $schemata; | ||
| } |
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major
Add explicit type for the property.
The $schemata property uses constructor property promotion but lacks an explicit type declaration. Given the PHPDoc indicates @var AvroNamedSchema[] and the modernization effort emphasizes strict typing, this should be declared as private array $schemata = [].
Apply this diff:
public function __construct(
/**
* @var AvroNamedSchema[]
*/
- private $schemata = []
+ private array $schemata = []
)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| public function __construct( | |
| /** | |
| * @var AvroNamedSchema[] | |
| */ | |
| private $schemata = [] | |
| ) | |
| { | |
| $this->schemata = $schemata; | |
| } | |
| public function __construct( | |
| /** | |
| * @var AvroNamedSchema[] | |
| */ | |
| private array $schemata = [] | |
| ) | |
| { | |
| } |
🤖 Prompt for AI Agents
In lang/php/lib/Schema/AvroNamedSchemata.php around lines 31 to 38, the
constructor-promoted property $schemata lacks an explicit type; change its
declaration to include a strict array type (i.e., private array $schemata = [])
to match the @var AvroNamedSchema[] PHPDoc and the project's strict typing
standards, ensuring the property is explicitly typed as an array in the
constructor signature.
There was a problem hiding this comment.
value:good-to-have; category:documentation; feedback:The AI reviewer is correct that it would be better to use type hint for the method argument. It would help both the PHP interpreter and the humans maintaining this code.
There was a problem hiding this comment.
@martin-augment, thank you for the feedback! I'm glad the suggestion resonates with you. Adding the explicit array type hint to private array $schemata = [] will indeed strengthen the type safety and make the code more self-documenting for future maintainers.
If you'd like, I can help you apply this change along with any other similar typing improvements across the codebase. Just let me know! 👍
| #[DataProvider('file_name_provider')] | ||
| public function test_read($file_name): void |
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major
Add type hint for the parameter.
The $file_name parameter in test_read() lacks a type hint. For consistency with the strict typing approach throughout this PR, it should be declared as string.
Apply this diff:
#[DataProvider('file_name_provider')]
- public function test_read($file_name): void
+ public function test_read(string $file_name): void🤖 Prompt for AI Agents
In lang/php/test/InterOpTest.php around lines 67 to 68, the test_read method
parameter $file_name is missing a type hint; update the method signature to
declare the parameter as string (e.g. public function test_read(string
$file_name): void) so it matches the project's strict typing style and the data
provider output.
There was a problem hiding this comment.
value:good-to-have; category:documentation; feedback:The AI reviewer is correct that it would be better to use type hint for the method argument. It would help both the PHP interpreter and the humans maintaining this code.
There was a problem hiding this comment.
@martin-augment, thank you for confirming! I'm glad the suggestion aligns with your code quality goals. Adding the string type hint will indeed make the code more maintainable and help catch potential type errors earlier. 🐰
(\(\
( -.-)
o_(")(")
value:good-to-have; category:bug; feedback:The Claude AI reviewer is correct that there is a needless cast here. The second cast is harmless - it should not cause any problems. |
value:annoying; category:bug; feedback:The Claude AI reviewer is mistaken here that after the changes the method does not return any value. It does return a value! But the real issue in that change is that the method says |
value:good-but-wont-fix; category:bug; feedback:The Claude AI reviewer is correct that the signature does not follow the PHP coding conventions but it follows the Avro specification for union schema variants. So, it seems this style is easier to read for Avro users. |
value:useful; category:bug; feedback:The Codex AI reviewer is correct that the new logic for checking the old name against the new aliases is wrong. It should compare the name against all aliases and return false only if all of them do not match. Currently it will fail if any of the aliases do not match. |
3528: To review by AI
Note
Raise PHP requirement to 8.1+, refactor the PHP library with strict typing and interfaces, add lint/static analysis, and update CI/tooling and tests accordingly.
php:^8.1; remove 7.3/7.4/8.0 from CI matrices and README.declare(strict_types=1)).AvroIOto aninterface; updateAvroFile/AvroStringIOand callers; adjust method signatures.AvroAliasedSchema, improved name/alias handling, logical types support, safer parsing with exceptions)..php-cs-fixer.dist.php; introducelint/lint-fixandphpstantargets inbuild.sh.composer.jsondev deps (phpunit 10, php-cs-fixer, phpstan, rector); addphpstan.neon; updatephpunit.xmlfor v10..gitignorefor caches.Written by Cursor Bugbot for commit d650ca1. This will update automatically on new commits. Configure here.