
Question on this 30x scaling tool #3

@BradKML

Description
  1. Assuming this can generate new architectures, can it merge different techniques to test for synergistic gains, similar to the iterative methods of DeepSeek or Qwen? Some features are incompatible with each other, and those incompatibilities would need to be outlined explicitly.
  2. Have you tested out-of-sample tasks (i.e., outside the training data), so that the generated models can be evaluated on whether they accelerate more complex emergent capabilities? For example, checking new small models on diverse hard tasks usually reserved for larger models.
  3. Can this system run replication studies to rule out flukes? What about understanding how certain advancements actually work, through deeper investigation and research? Could that be turned into a separate metric (maybe an ELI5 meta-judge)?
  4. Would it be better if this project were PKM-compatible, so that humans can read the AI-generated research notes and self-studies? That way the system might also organize its findings more efficiently.
  5. Considering NanoPoor by @VatsaDev (and similar projects), would ASI-Arch be suitable for small-scale citizen research with limited GPU compute?

P.S. The 30x estimate comes from the ~4000-hour mark, where the exponential gains from optimization have been squeezed out and the architecture count starts growing linearly. A bit odd, but worth looking into.
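To make the P.S. concrete, here is a minimal sketch of how such a speedup figure could be derived: fit the slope of the linear tail of the (GPU-hours, architecture-count) curve and divide by a baseline discovery rate. Every number below is a hypothetical placeholder, not data from ASI-Arch; `human_baseline` in particular is an invented reference rate.

```python
# Hedged sketch: deriving an "Nx" scaling estimate from the linear regime.
# All data points here are hypothetical, chosen only to illustrate the method.

def linear_rate(points):
    """Least-squares slope of (hours, architecture_count) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in points)
    den = sum((x - mean_x) ** 2 for x, _ in points)
    return num / den

# Hypothetical tail of the run, after the ~4000 hr mark where growth looks linear.
tail = [(4000, 1200), (4500, 1290), (5000, 1380), (5500, 1470)]

# Architectures discovered per GPU-hour in the linear regime.
rate = linear_rate(tail)  # 0.18 per hour for these made-up points

# Dividing by a (purely illustrative) baseline discovery rate yields a
# speedup figure analogous to the 30x estimate above.
human_baseline = 0.006  # architectures per hour, hypothetical
speedup = rate / human_baseline
```

The key point is that the estimate only makes sense once the curve has gone linear; before that, any slope you fit is still changing as the easy optimizations get used up.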
