Merged
Changes from 18 commits
21 commits
b451a83
Add C implementation of date.fromisoformat
pganssle Dec 1, 2017
4ccef0a
Add python implementation of date.fromisoformat
pganssle Dec 1, 2017
13e8e48
Add tests for date.fromisoformat
pganssle Dec 1, 2017
6fa5c55
Implement C version of datetime.fromisoformat
pganssle Dec 2, 2017
0d44220
Add initial test suite for C-only datetime.fromisoformat
pganssle Dec 2, 2017
327d0fc
Add C implementation of time.fromisoformat()
pganssle Dec 4, 2017
52b2175
Add tests for time.isoformat()
pganssle Dec 4, 2017
f1b78af
Add pure python implementation of time.fromisoformat()
pganssle Dec 4, 2017
aeaa9ca
Add tests for pure python time.fromisoformat()
pganssle Dec 4, 2017
094ccf4
Add pure python implementation of datetime.fromisoformat
pganssle Dec 4, 2017
2a8120d
Enable tests for pure python datetime.fromisoformat
pganssle Dec 4, 2017
af9e6d0
Add documentation for [date][time].fromisoformat()
pganssle Dec 4, 2017
cf802af
Consolidate helper functions into parse_digits
pganssle Dec 5, 2017
626d239
Refactor datetime.isoformat round trip tests
pganssle Dec 5, 2017
7c771e7
Refactor C code for PEP 7
pganssle Dec 5, 2017
8fbd752
Add support for seconds in fromisoformat offsets
pganssle Dec 6, 2017
4d55e05
Fix pure python implementation of isoformat() for sub-second zones
pganssle Dec 9, 2017
5a233fb
Add support for subsecond offsets to fromisoformat
pganssle Dec 9, 2017
9fa91db
Fix documentation and pure python error catching in fromisoformat
pganssle Dec 18, 2017
ffdb2af
Drop unsupported sep parameter in _tzstr
pganssle Dec 18, 2017
18a5fa8
Add test for ambiguous isoformat strings
pganssle Dec 18, 2017
47 changes: 46 additions & 1 deletion Doc/library/datetime.rst
@@ -436,6 +436,21 @@ Other constructors, all class methods:
d``.


.. classmethod:: date.fromisoformat(date_string)

   Return a :class:`date` corresponding to a *date_string* in one of the ISO 8601
   formats emitted by :meth:`date.isoformat`. Specifically, this function supports
   strings in the format(s) ``YYYY-MM-DD``.
Member

There is only one format emitted by date.isoformat and supported by your date.fromisoformat, as far as I know.


   .. caution::

      This does not support parsing arbitrary ISO 8601 strings - it is only intended
      as the inverse operation of :meth:`date.isoformat`.

   .. versionadded:: 3.7
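
A minimal usage sketch of the round trip this entry documents, assuming Python 3.7 or later (where `date.fromisoformat` exists). The sample date is arbitrary.

```python
from datetime import date

d = date(2017, 12, 4)
s = d.isoformat()  # date.isoformat() emits YYYY-MM-DD
assert s == "2017-12-04"

# fromisoformat() is intended as the inverse of isoformat()
restored = date.fromisoformat(s)
assert restored == d
```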



Class attributes:

.. attribute:: date.min
@@ -819,6 +834,20 @@ Other constructors, all class methods:
Added the *tzinfo* argument.


.. classmethod:: datetime.fromisoformat(date_string)

   Return a :class:`datetime` corresponding to a *date_string* in one of the
   ISO 8601 formats emitted by :meth:`datetime.isoformat`. Specifically, this function
Member

I would drop “ISO 8601” and just say “one of the formats emitted by isoformat”. As far as I understand, ISO 8601 doesn’t have a seconds field in time zones, but it seems you want to support this.

If you intend to support dates without the time part, maybe write “emitted by datetime.isoformat and date.isoformat”. This may also need a test case; I didn’t notice anything relevant.

Member Author

@vadmium The test cases from TestDate are actually inherited by TestDateTime, so it is indeed supported. I think it's fair to support them.

   supports strings in the format(s) ``YYYY-MM-DD[*HH[:MM[:SS[.mmm[mmm]]]]][+HH:MM[:SS[.ffffff]]]``,
   where ``*`` can match any single character.
Member

Should the colon after :MM be optional?

Member

I think it would be less ambiguous putting the time zone inside the optional time part. Test case:

datetime(2017, 12, 18, 11, 0).isoformat(sep="+", timespec="minutes") -> "2017-12-18+11:00"
datetime.fromisoformat("2017-12-18+11:00") -> datetime(2017, 12, 18, 11, 0)

Member Author

@pganssle pganssle Dec 18, 2017

Hmmm.. As much as I dislike the idea of timezone being part of the time component, this is fine I suppose (and in fact that is how it is currently implemented).

That said, another design I considered is one where we take sep as a keyword argument to isoparse, which would relieve this ambiguity for all separators other than - and + if we wanted to eventually allow parsing of strings of the format YYYY-MM-DD+HH:MM. That's one decision we'd have to make in this version because changing it would not be backwards compatible.
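
The ambiguity discussed in this thread can be reproduced with a short sketch, assuming Python 3.7 or later. Because the character at index 10 is treated as a separator, a `+` there is not read as the start of a UTC offset; the variable names are illustrative only.

```python
from datetime import datetime

naive = datetime(2017, 12, 18, 11, 0)

# Using "+" as the separator yields a string that looks like a
# date followed by a UTC offset.
s = naive.isoformat(sep="+", timespec="minutes")
assert s == "2017-12-18+11:00"

# The parser treats position 10 as the separator, so the result is
# a naive datetime at 11:00, not a date with a +11:00 offset.
round_tripped = datetime.fromisoformat(s)
assert round_tripped == naive
assert round_tripped.tzinfo is None
```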


   .. caution::

      This does not support parsing arbitrary ISO 8601 strings - it is only intended
      as the inverse operation of :meth:`datetime.isoformat`.

   .. versionadded:: 3.7
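
A small sketch of the documented round trip for an aware datetime, assuming Python 3.7 or later. The offset and separators are arbitrary choices for illustration.

```python
from datetime import datetime, timedelta, timezone

est = timezone(timedelta(hours=-5))
dt = datetime(2017, 12, 4, 12, 30, 15, 250000, tzinfo=est)

default_string = dt.isoformat()
assert default_string == "2017-12-04T12:30:15.250000-05:00"

# Any single-character separator emitted by isoformat() is accepted back.
results = [datetime.fromisoformat(dt.isoformat(sep=sep)) for sep in ("T", " ")]
assert all(r == dt and r.utcoffset() == timedelta(hours=-5) for r in results)
```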

.. classmethod:: datetime.strptime(date_string, format)

Return a :class:`.datetime` corresponding to *date_string*, parsed according to
@@ -1486,6 +1515,23 @@ In boolean contexts, a :class:`.time` object is always considered to be true.
error-prone and has been removed in Python 3.5. See :issue:`13936` for full
details.


Other constructor:

.. classmethod:: time.fromisoformat(date_string)
Member

time_string?


   Return a :class:`time` corresponding to a *time_string* in one of the ISO 8601
   formats emitted by :meth:`time.isoformat`. Specifically, this function supports
   strings in the format(s) ``HH[:MM[:SS[.mmm[mmm]]]]][+HH:MM[:SS[.ffffff]]]``.
Member

Too many square brackets.


   .. caution::

      This does not support parsing arbitrary ISO 8601 strings - it is only intended
      as the inverse operation of :meth:`time.isoformat`.

   .. versionadded:: 3.7
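
A brief sketch of the forms `time.fromisoformat` is documented to accept, assuming Python 3.7 or later. The sample values are arbitrary.

```python
from datetime import time, timedelta

t = time(4, 23, 1, 384000)
ms_string = t.isoformat(timespec="milliseconds")
assert ms_string == "04:23:01.384"

# Three fractional digits are read as milliseconds.
from_ms = time.fromisoformat(ms_string)
assert from_ms == t

# Trailing components are optional; an offset produces an aware time.
short = time.fromisoformat("04:23")
aware = time.fromisoformat("04:23:01+05:30")
assert aware.utcoffset() == timedelta(hours=5, minutes=30)
```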


Instance methods:

.. method:: time.replace(hour=self.hour, minute=self.minute, second=self.second, \
@@ -1587,7 +1633,6 @@ Instance methods:
``self.tzinfo.tzname(None)``, or raises an exception if the latter doesn't
return ``None`` or a string object.


Example:

>>> from datetime import time, tzinfo, timedelta
201 changes: 173 additions & 28 deletions Lib/datetime.py
@@ -173,6 +173,24 @@ def _format_time(hh, mm, ss, us, timespec='auto'):
else:
return fmt.format(hh, mm, ss, us)

def _format_offset(off):
    s = ''
    if off is not None:
        if off.days < 0:
            sign = "-"
            off = -off
        else:
            sign = "+"
        hh, mm = divmod(off, timedelta(hours=1))
        mm, ss = divmod(mm, timedelta(minutes=1))
        s += "%s%02d:%02d" % (sign, hh, mm)
        if ss or ss.microseconds:
            s += ":%02d" % ss.seconds

            if ss.microseconds:
                s += '.%06d' % ss.microseconds
    return s
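
To try the offset formatting outside the module, here is a standalone sketch that mirrors the helper above. The name `format_offset` is hypothetical (it is not part of `datetime`), and the redundant `ss or ss.microseconds` check is reduced to `ss`, which is truthy whenever the remainder is nonzero.

```python
from datetime import timedelta


def format_offset(off):
    """Format a UTC offset as +HH:MM[:SS[.ffffff]], or '' for None."""
    if off is None:
        return ''
    sign = "+"
    if off.days < 0:
        sign = "-"
        off = -off
    # divmod on timedeltas returns (int quotient, timedelta remainder)
    hh, rest = divmod(off, timedelta(hours=1))
    mm, ss = divmod(rest, timedelta(minutes=1))
    s = "%s%02d:%02d" % (sign, hh, mm)
    if ss:
        s += ":%02d" % ss.seconds
        if ss.microseconds:
            s += ".%06d" % ss.microseconds
    return s
```

Seconds and microseconds are appended only when nonzero, so whole-minute offsets keep the familiar `+HH:MM` shape.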

# Correctly substitute for %z and %Z escapes in strftime formats.
def _wrap_strftime(object, format, timetuple):
# Don't call utcoffset() or tzname() unless actually needed.
@@ -237,6 +255,102 @@ def _wrap_strftime(object, format, timetuple):
newformat = "".join(newformat)
return _time.strftime(newformat, timetuple)

# Helpers for parsing the result of isoformat()
def _parse_isoformat_date(dtstr):
    # It is assumed that this function will only be called with a
    # string of length exactly 10, and (though this is not used) ASCII-only
    year = int(dtstr[0:4])
    if dtstr[4] != '-':
        raise ValueError('Invalid date separator: %s' % dtstr[4])

    month = int(dtstr[5:7])

    if dtstr[7] != '-':
        raise ValueError('Invalid date separator')

    day = int(dtstr[8:10])

    return [year, month, day]

def _parse_hh_mm_ss_ff(tstr):
    # Parses things of the form HH[:MM[:SS[.fff[fff]]]]
    len_str = len(tstr)

    time_comps = [0, 0, 0, 0]
    pos = 0
    for comp in range(0, 3):
        if (len_str - pos) < 2:
            raise ValueError('Incomplete time component')

        time_comps[comp] = int(tstr[pos:pos+2])

        pos += 2
        next_char = tstr[pos:pos+1]

        if not next_char or comp >= 2:
            break

        if next_char != ':':
            raise ValueError('Invalid time separator: %c' % next_char)

        pos += 1

    if pos < len_str:
        if tstr[pos] != '.':
            raise ValueError('Invalid microsecond component')
        else:
            pos += 1

            len_remainder = len_str - pos
            if len_remainder not in (3, 6):
                raise ValueError('Invalid microsecond component')

            time_comps[3] = int(tstr[pos:])
            if len_remainder == 3:
                time_comps[3] *= 1000

    return time_comps
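
For comparison, the grammar `HH[:MM[:SS[.fff[fff]]]]` can also be written as a regular expression. This is an alternative sketch, not the PR's approach (the PR uses manual index arithmetic), and `parse_hh_mm_ss_ff` is a hypothetical standalone name. It is stricter than the helper above in one respect: it accepts ASCII digits only, whereas `int()` on a slice tolerates a few other inputs.

```python
import re

# Nested optional groups mirror the nested brackets in the format.
_TIME_RE = re.compile(
    r'([0-9]{2})(?::([0-9]{2})(?::([0-9]{2})(?:\.([0-9]{6}|[0-9]{3}))?)?)?'
)


def parse_hh_mm_ss_ff(tstr):
    """Return [hour, minute, second, microsecond] or raise ValueError."""
    m = _TIME_RE.fullmatch(tstr)
    if m is None:
        raise ValueError('Invalid time string: %r' % tstr)
    hh, mm, ss, frac = m.groups()
    us = 0
    if frac:
        # Three digits are milliseconds; six digits are microseconds.
        us = int(frac) * (1000 if len(frac) == 3 else 1)
    return [int(hh), int(mm or 0), int(ss or 0), us]
```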

def _parse_isoformat_time(tstr):
    # Format supported is HH[:MM[:SS[.fff[fff]]]][+HH:MM[:SS[.ffffff]]]
    len_str = len(tstr)
    if len_str < 2:
        raise ValueError('Isoformat time too short')

    # This is equivalent to re.search('[+-]', tstr), but faster
    tz_pos = (tstr.find('-') + 1 or tstr.find('+') + 1)
    timestr = tstr[:tz_pos-1] if tz_pos > 0 else tstr

    time_comps = _parse_hh_mm_ss_ff(timestr)

    tzi = None
    if tz_pos > 0:
        tzstr = tstr[tz_pos:]

        # Valid time zone strings are:
        # HH:MM               len: 5
        # HH:MM:SS            len: 8
        # HH:MM:SS.ffffff     len: 15

        if len(tzstr) not in (5, 8, 15):
            raise ValueError('Malformed time zone string')

        tz_comps = _parse_hh_mm_ss_ff(tzstr)
        if all(x == 0 for x in tz_comps):
            tzi = timezone.utc
        else:
            tzsign = -1 if tstr[tz_pos - 1] == '-' else 1

            td = timedelta(hours=tz_comps[0], minutes=tz_comps[1],
                           seconds=tz_comps[2], microseconds=tz_comps[3])

            tzi = timezone(tzsign * td)

    time_comps.append(tzi)

    return time_comps
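
The `tz_pos` expression relies on `str.find` returning -1 when the character is absent: adding 1 gives 0, which is falsy, so `or` falls through to the next search. A standalone sketch of just that splitting step follows; `split_time_and_offset` is a hypothetical name used only for illustration.

```python
def split_time_and_offset(tstr):
    """Split 'HH:MM...[+-]HH:MM...' into (time part, sign, offset part)."""
    # find() returns -1 if missing, so "+ 1" maps "missing" to 0 (falsy).
    tz_pos = (tstr.find('-') + 1 or tstr.find('+') + 1)
    if tz_pos > 0:
        # tz_pos points one past the sign character.
        return tstr[:tz_pos - 1], tstr[tz_pos - 1], tstr[tz_pos:]
    return tstr, None, None
```

Note that `'-'` is searched first, so for input containing both characters this picks the first `'-'`, not necessarily the first sign character overall.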


# Just raise TypeError if the arg isn't None or a string.
def _check_tzname(name):
if name is not None and not isinstance(name, str):
@@ -732,6 +846,19 @@ def fromordinal(cls, n):
        y, m, d = _ord2ymd(n)
        return cls(y, m, d)

    @classmethod
    def fromisoformat(cls, date_string):
        """Construct a date from the output of date.isoformat()."""
        if not isinstance(date_string, str):
            raise TypeError('fromisoformat: argument must be str')

        try:
            assert len(date_string) == 10
            return cls(*_parse_isoformat_date(date_string))
        except:
Member

Write `except Exception`, to avoid catching `KeyboardInterrupt` or similar. Or even better, be explicit and list the exceptions you are expecting (`AssertionError`, `ValueError`, `IndexError`?).

Member Author

@vadmium I think catching `Exception` is probably the right thing to do; I didn't think about `KeyboardInterrupt`. I'm mainly trying to get the C implementation and the Python implementation to always raise the same exceptions, and the C implementation only ever raises `ValueError` (I will fuzz it some time this week to verify this), hence the catch-and-re-raise.
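
The difference the reviewer points out can be seen in a small sketch. The wrapper and parser names below are hypothetical; only the `try`/`except` shapes correspond to the code under review.

```python
def parse_with_bare_except(parser, text):
    try:
        return parser(text)
    except:  # deliberately bare: also traps KeyboardInterrupt and SystemExit
        raise ValueError('Invalid isoformat string: %s' % text)


def parse_with_except_exception(parser, text):
    try:
        return parser(text)
    except Exception:  # KeyboardInterrupt is a BaseException, so it passes through
        raise ValueError('Invalid isoformat string: %s' % text)


def interrupted_parser(text):
    # Stand-in for the user pressing Ctrl-C while parsing is in progress.
    raise KeyboardInterrupt


def outcome(wrapper):
    """Return the name of the exception type that escapes the wrapper."""
    try:
        wrapper(interrupted_parser, "2017-12-18")
    except BaseException as exc:
        return type(exc).__name__
    return None
```

With the bare `except`, the interrupt is reported as a `ValueError`; with `except Exception`, it propagates unchanged.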

            raise ValueError('Invalid isoformat string: %s' % date_string)


# Conversions to string

def __repr__(self):
@@ -1193,19 +1320,7 @@ def __hash__(self):
def _tzstr(self, sep=":"):
Member

Good opportunity to remove the unsupported sep parameter

"""Return formatted timezone offset (+xx:xx) or None."""
Member

or the empty string

         off = self.utcoffset()
-        if off is not None:
-            if off.days < 0:
-                sign = "-"
-                off = -off
-            else:
-                sign = "+"
-            hh, mm = divmod(off, timedelta(hours=1))
-            mm, ss = divmod(mm, timedelta(minutes=1))
-            assert 0 <= hh < 24
-            off = "%s%02d%s%02d" % (sign, hh, sep, mm)
-            if ss:
-                off += ':%02d' % ss.seconds
-        return off
+        return _format_offset(off)

def __repr__(self):
"""Convert to formal string, for repr()."""
@@ -1244,6 +1359,18 @@ def isoformat(self, timespec='auto'):

__str__ = isoformat

    @classmethod
    def fromisoformat(cls, time_string):
        """Construct a time from the output of isoformat()."""
        if not isinstance(time_string, str):
            raise TypeError('fromisoformat: argument must be str')
Member Author

I'm not really sure why this is not getting hit by the test suite. test_fromisoformat_fails_typeerror is designed explicitly to hit this condition. Anyone have an idea why it's getting missed?


        try:
            return cls(*_parse_isoformat_time(time_string))
        except:
            raise ValueError('Invalid isoformat string: %s' % time_string)


def strftime(self, fmt):
"""Format using strftime(). The date part of the timestamp passed
to underlying strftime should not be used.
@@ -1497,6 +1624,31 @@ def combine(cls, date, time, tzinfo=True):
time.hour, time.minute, time.second, time.microsecond,
tzinfo, fold=time.fold)

    @classmethod
    def fromisoformat(cls, date_string):
        """Construct a datetime from the output of datetime.isoformat()."""
        if not isinstance(date_string, str):
            raise TypeError('fromisoformat: argument must be str')

        # Split this at the separator
        dstr = date_string[0:10]
        tstr = date_string[11:]

        try:
            date_components = _parse_isoformat_date(dstr)
        except ValueError:
            raise ValueError('Invalid isoformat string: %s' % date_string)

        if tstr:
            try:
                time_components = _parse_isoformat_time(tstr)
            except ValueError:
                raise ValueError('Invalid isoformat string: %s' % date_string)
        else:
            time_components = [0, 0, 0, 0, None]

        return cls(*(date_components + time_components))
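
The fixed-position split is why a date-only string is accepted by `datetime.fromisoformat`: slicing past the end of a string yields an empty string, so the time part falls back to midnight. A small sketch, assuming Python 3.7 or later; `split_isoformat` is a hypothetical helper that repeats the two slices above.

```python
from datetime import datetime


def split_isoformat(date_string):
    """Return (date part, time part); index 10 is the skipped separator."""
    return date_string[0:10], date_string[11:]


with_time = split_isoformat("2017-12-04T12:30")
date_only = split_isoformat("2017-12-04")
assert with_time == ("2017-12-04", "12:30")
assert date_only == ("2017-12-04", "")  # out-of-range slice is empty

# A date-only string therefore parses to midnight with no tzinfo.
midnight = datetime.fromisoformat("2017-12-04")
assert midnight == datetime(2017, 12, 4)
```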

def timetuple(self):
"Return local time tuple compatible with time.localtime()."
dst = self.dst()
@@ -1673,18 +1825,10 @@ def isoformat(self, sep='T', timespec='auto'):
self._microsecond, timespec))

         off = self.utcoffset()
-        if off is not None:
-            if off.days < 0:
-                sign = "-"
-                off = -off
-            else:
-                sign = "+"
-            hh, mm = divmod(off, timedelta(hours=1))
-            mm, ss = divmod(mm, timedelta(minutes=1))
-            s += "%s%02d:%02d" % (sign, hh, mm)
-            if ss:
-                assert not ss.microseconds
-                s += ":%02d" % ss.seconds
+        tz = _format_offset(off)
+        if tz:
+            s += tz
+
         return s
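
The effect of routing `isoformat()` through `_format_offset` shows up with sub-second offsets, which the removed code rejected via `assert not ss.microseconds`. A sketch assuming Python 3.7 or later, with an arbitrary offset chosen only to exercise the microsecond branch:

```python
from datetime import datetime, timedelta, timezone

tz = timezone(timedelta(seconds=1, microseconds=500))
dt = datetime(2017, 12, 9, tzinfo=tz)

s = dt.isoformat()
assert s == "2017-12-09T00:00:00+00:00:01.000500"

# The same string parses back to an equal datetime with the same offset.
parsed = datetime.fromisoformat(s)
assert parsed == dt
assert parsed.utcoffset() == timedelta(seconds=1, microseconds=500)
```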

def __repr__(self):
@@ -2275,9 +2419,10 @@ def _name_from_offset(delta):
          _check_date_fields, _check_int_field, _check_time_fields,
          _check_tzinfo_arg, _check_tzname, _check_utc_offset, _cmp, _cmperror,
          _date_class, _days_before_month, _days_before_year, _days_in_month,
-         _format_time, _is_leap, _isoweek1monday, _math, _ord2ymd,
-         _time, _time_class, _tzinfo_class, _wrap_strftime, _ymd2ord,
-         _divide_and_round)
+         _format_time, _format_offset, _is_leap, _isoweek1monday, _math,
+         _ord2ymd, _time, _time_class, _tzinfo_class, _wrap_strftime, _ymd2ord,
+         _divide_and_round, _parse_isoformat_date, _parse_isoformat_time,
+         _parse_hh_mm_ss_ff)
     # XXX Since import * above excludes names that start with _,
     # docstring does not get overwritten. In the future, it may be
     # appropriate to maintain a single module level docstring and