12 changes: 10 additions & 2 deletions log_config.yaml
@@ -1,5 +1,5 @@
version: 1
-disable_existing_loggers: True
+disable_existing_loggers: False

formatters:
default:
@@ -12,7 +12,15 @@ handlers:
level: DEBUG
formatter: default
stream: ext://sys.stderr
+file_rotating:
+class: logging.handlers.TimedRotatingFileHandler
+level: DEBUG
+filename: ictrl_log.log
+formatter: default
+when: 'midnight'
+interval: 1
+backupCount: 14

🧹 Nitpick (assertive)

Consider enhancing the rotating file handler configuration.

While the basic setup is functional, consider the following improvements:

  1. Use absolute path or environment variable for log file location
  2. Add file permissions configuration
  3. Configure error handling for file operations
  4. Add size limits to prevent disk space issues

Example enhancement:

     file_rotating:
         class: logging.handlers.TimedRotatingFileHandler
         level: DEBUG
-        filename: ictrl_log.log
+        filename: ${ICTRL_LOG_DIR}/ictrl_log.log
         formatter: default
         when: 'midnight'
         interval: 1
         backupCount: 14
+        encoding: utf-8
+        delay: true
+        mode: 0o644
+        maxBytes: 52428800  # 50MB

Committable suggestion skipped: line range outside the PR's diff.
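
One caveat on the suggested diff above: as far as the standard library documents it, `logging.handlers.TimedRotatingFileHandler` accepts `filename`, `when`, `interval`, `backupCount`, `encoding`, `delay`, `utc` and `atTime`. It has no `mode` or `maxBytes` parameter (`maxBytes` belongs to `RotatingFileHandler`), and `dictConfig` does not expand `${...}` placeholders on its own. A minimal sketch of a variant that should load, assuming a hypothetical `ICTRL_LOG_DIR` environment variable that is resolved in Python before the config is applied (the helper name `build_log_config` is also illustrative):

```python
import logging
import logging.config
import logging.handlers
import os
import tempfile


def build_log_config(log_dir):
    """Return a dictConfig-style dict that uses only keyword arguments
    TimedRotatingFileHandler is documented to accept."""
    return {
        "version": 1,
        "disable_existing_loggers": False,
        "formatters": {
            "default": {
                # Format string quoted later in this thread.
                "format": "%(asctime)s %(levelname)s [%(name)s:%(lineno)d] %(message)s",
            },
        },
        "handlers": {
            "file_rotating": {
                "class": "logging.handlers.TimedRotatingFileHandler",
                "level": "DEBUG",
                "formatter": "default",
                "filename": os.path.join(log_dir, "ictrl_log.log"),
                "when": "midnight",
                "interval": 1,
                "backupCount": 14,
                "encoding": "utf-8",
                "delay": True,  # open the file on first write, not at config time
            },
        },
        "root": {"level": "DEBUG", "handlers": ["file_rotating"]},
    }


# ICTRL_LOG_DIR is an assumption for illustration; fall back to the temp dir.
log_dir = os.environ.get("ICTRL_LOG_DIR", tempfile.gettempdir())
config = build_log_config(log_dir)
logging.config.dictConfig(config)
```

File permissions and a size cap would need a different mechanism (for example the process umask, or a handler subclass); neither is configurable through this handler's constructor.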

Owner

How did we arrive at this count? Do we have an estimate of how many logs iCtrl generates per hour? Then how many (bytes of) logs do we plan to keep on the user's machine? Note this should not exceed the application environment requirements in your previous analysis, say in the Project Proposal.

Contributor Author

Initially, I just thought it was a good idea to keep backups of the past 2 weeks' logs. However, let's do the calculation against what we wrote in the project proposal.

We are using the following format for our log message:
'%(asctime)s %(levelname)s [%(name)s:%(lineno)d] %(message)s'.

Let's make some deliberately pessimistic assumptions for each field to estimate the length of a very long log message.

  • %(asctime)s: This value is fixed at 23 characters
  • %(levelname)s: the longest level name is CRITICAL, which is 8 characters
  • %(name)s: Assume the longest Python file name to be 20 characters
  • %(lineno)d: Assume the longest line number to be 5 digits (characters)
  • %(message)s: Assume a very detailed log message that has 200 characters
  • Other characters: the literal spaces, brackets and colon in the format add up to 6 characters, and a line ending of CR plus LF adds 2 more, for 8 characters

In total, there are 264 characters, so each message is 528 bytes, assuming each character is 2 bytes according to the UTF standard, which by default is 2 bytes each.

Assuming 5 log messages are generated each second, that is 2,640 bytes/second. Each 3-hour period then generates 60 x 60 x 3 x 2,640 = 28,512,000 bytes of data, which is 27.1912 MB. Let's round it up to 28 MB to cover some metadata generated by file rotation and to make the calculation easier.

The requirement in the project proposal says that the size of the logs generated by iCtrl cannot exceed 32 GB. We originally planned to back up 2 weeks of data, which is 112 rotation periods of 3 hours each. The total size of the log backups would be 112 x 28 = 3,136 MB = 3.0625 GB, which is far below that limit.

Therefore, I think 2 weeks of backup data is appropriate. Since the assumptions above are deliberately pessimistic, the calculation also shows that the system could accommodate more backups.
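
The arithmetic above can be reproduced with a short script. This is only a sketch of the estimate as stated in this comment; the constant names are illustrative, and "MB"/"GB" are treated as binary units (2^20 and 2^30 bytes), which is what the figures above imply.

```python
# Worst-case field widths, in characters, as assumed in the comment above.
FIELD_WIDTHS = {
    "asctime": 23,
    "levelname": 8,
    "name": 20,
    "lineno": 5,
    "message": 200,
    "separators_and_crlf": 8,
}
BYTES_PER_CHAR = 2                      # assumption made in the comment above
MESSAGES_PER_SECOND = 5                 # assumed logging rate
ROTATION_PERIOD_SECONDS = 3 * 60 * 60   # one 3-hour rotation period
ROTATION_PERIODS_KEPT = 112             # 14 days x 8 periods per day
ROUNDED_MIB_PER_PERIOD = 28             # 27.19 rounded up for rotation overhead

chars_per_message = sum(FIELD_WIDTHS.values())
bytes_per_message = chars_per_message * BYTES_PER_CHAR
bytes_per_period = bytes_per_message * MESSAGES_PER_SECOND * ROTATION_PERIOD_SECONDS
mib_per_period = bytes_per_period / 2**20

total_mib = ROTATION_PERIODS_KEPT * ROUNDED_MIB_PER_PERIOD
total_gib = total_mib / 1024

print(f"{chars_per_message} chars, {bytes_per_message} bytes per message")
print(f"{bytes_per_period:,} bytes ({mib_per_period:.4f} MiB) per period")
print(f"{total_mib:,} MiB = {total_gib} GiB retained")
```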

Owner

@junhaoliao junhaoliao Mar 27, 2025

i think there can be some pitfalls in the derivation, though the difference likely won't be more than 50%.

> each character is 2 bytes according to the UTF standard, which by default is 2 bytes each.

which UTF standard are we referring to? if it's UTF-16, where is it configured and why we want to configure so, given iCtrl only outputs English logs?

> %(name)s: Assume the longest Python file name to be 20 characters

20 characters are still too small. Even with rigid standards such as MISRA 2012, identifiers names are regulated to have max 31 characters. I think we can make worst case assumptions. On most file systems, max file name length is 255 characters.

> %(message)s: Assume a very detailed log message that has 200 characters

this is fine. people often use 140 which is an old limit of Tweets.

The original comment meant to say the derivation should be somewhere documented publicly. If you think it can be too many comments to add in the YAML file, at least we want to show a link of this thread in a comment before the number.

Contributor Author

@li-ruihao li-ruihao Mar 28, 2025

> which UTF standard are we referring to? if it's UTF-16, where is it configured and why we want to configure so, given iCtrl only outputs English logs?

Thank you for raising this. My calculation was a worst-case estimate; I wanted to show that log size would not be a problem even then. For iCtrl, I believe we would only use UTF-8, and because we only log English (ASCII) text, each character takes 1 byte. We can therefore redo the calculation with 1 byte/character.
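
A quick check of the 1 byte/character figure, as a sketch: the sample line below is made up for illustration and only follows the shape of our format string. Note that the file handler uses the platform's default encoding unless `encoding` is set explicitly, so pinning `encoding: utf-8` in the handler config would be needed to guarantee this.

```python
# Hypothetical log line shaped like our format, ending in CR LF.
sample_line = "2025-03-27 12:00:00,123 CRITICAL [example_module:42] example message\r\n"

utf8_bytes = len(sample_line.encode("utf-8"))
utf16_bytes = len(sample_line.encode("utf-16-le"))  # little-endian, no BOM

# ASCII-only text: 1 byte per character in UTF-8, 2 bytes in UTF-16.
print(len(sample_line), utf8_bytes, utf16_bytes)
```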

> 20 characters are still too small. Even with rigid standards such as MISRA 2012, identifiers names are regulated to have max 31 characters. I think we can make worst case assumptions. On most file systems, max file name length is 255 characters.

Yes, on most file systems the maximum file name length is 255 characters. I made the 20-character assumption based on the naming convention we use, and I was also assuming no one would change the names manually. We can proceed with 255 characters per file name as the worst case.

> this is fine. people often use 140 which is an old limit of Tweets.

Sure. However, I just found in "Log and event storage best practices" that event log entries usually average around 200 bytes, so I think it is reasonable to assume every log message is around 200 characters.

> The original comment meant to say the derivation should be somewhere documented publicly. If you think it can be too many comments to add in the YAML file, at least we want to show a link of this thread in a comment before the number.

Yeah, I think adding a link to this thread in the YAML file would be better. I will go ahead and do it.

The following is a recalculation based on the above items:

characters per message * bytes per character * messages logged per second * seconds in 3 hours * 112 rotation periods (2 weeks)

= (23 + 8 + 255 + 5 + 200 + 8) * 1 * 5 * (60 * 60 * 3) * 112
= 499 * 5 * 10,800 * 112
= 3,017,952,000 bytes
= 2.81 GB

This is still far below the 32 GB limit in the project proposal.
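
The recalculation can be checked the same way. Again a sketch with illustrative names, with "GB" read as 2^30 bytes; it also confirms that 112 three-hour periods cover the same 14 days that `backupCount: 14` with `when: 'midnight'` retains.

```python
# Revised worst-case field widths, in characters (name raised to 255).
REVISED_WIDTHS = {
    "asctime": 23,
    "levelname": 8,
    "name": 255,
    "lineno": 5,
    "message": 200,
    "separators_and_crlf": 8,
}
BYTES_PER_CHAR = 1                 # ASCII-only logs written as UTF-8
MESSAGES_PER_SECOND = 5            # assumed logging rate
SECONDS_PER_PERIOD = 3 * 60 * 60   # one 3-hour rotation period
PERIODS_KEPT = 112
LIMIT_GIB = 32                     # limit stated in the project proposal

chars_per_message = sum(REVISED_WIDTHS.values())
bytes_per_second = chars_per_message * BYTES_PER_CHAR * MESSAGES_PER_SECOND
total_bytes = bytes_per_second * SECONDS_PER_PERIOD * PERIODS_KEPT
total_gib = total_bytes / 2**30
days_retained = PERIODS_KEPT * SECONDS_PER_PERIOD / 86_400

print(f"{chars_per_message} chars/message, {bytes_per_second} bytes/second")
print(f"{total_bytes:,} bytes = {total_gib:.2f} GiB over {days_retained:g} days")
print(f"share of the {LIMIT_GIB} GiB limit: {total_gib / LIMIT_GIB:.1%}")
```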


root:
level: DEBUG
-handlers: [console]
+handlers: [console, file_rotating]