3 changes: 3 additions & 0 deletions .gitignore
@@ -28,3 +28,6 @@ __pycache__/

# go build output
/_output

# macOS
*/.DS_Store
33 changes: 33 additions & 0 deletions docs/proposals/EntGAN.md
@@ -0,0 +1,33 @@
# Integrate GAN and Self-taught Learning into Sedna Lifelong Learning to Handle Unknown Tasks

## Motivation

In the process of Sedna lifelong learning, the system may confront unknown tasks, whose data are typically heterogeneous and small-sample. Generative Adversarial Networks (GANs) are state-of-the-art generative models that can generate fake data following the distribution of the real data, so we naturally try to utilize a GAN to handle the small-sample problem. Self-taught learning is an approach that improves classification performance by using sparse coding on unlabeled data to construct higher-level features. Hence, we combine GAN and self-taught learning to help Sedna lifelong learning handle unknown tasks.

### Goals
@MooreZheng (Collaborator) commented on Oct 18, 2022:

See the previous discussion in #337 (comment)

The story is not yet complete:

  1. How would we solve the small data problem in lifelong learning?
     - Lifelong learning limitation: lifelong learning tackles the small data issue by incrementally training with labeled data, but labeling is labor-intensive and data collection is time-consuming.
     - The proposal reduces the time for data collection: we generate data with a GAN instead of collecting it in the real world.
     - The proposal reduces the intensive labor: we leverage self-taught learning to eliminate the labeling job.
  2. It would be much improved with the targeting scenario and dataset added, i.e., semantic segmentation and Cityscapes.

Collaborator commented:

The current version does not express the limitation of the current lifelong learning.

  1. The current lifelong learning is already designed to tackle small-sample problems.
  2. Why do we still need a GAN? A GAN cannot generate labeled data so far, so why not just mount a camera on a car and collect more data?
     The author might want to consider adding a related story mentioned in https://github.com/kubeedge/sedna/pull/337/files#r997678113.


* Handle unknown tasks
* Implement a lightweight GAN to solve the small-sample problem
* Utilize self-taught learning to solve the heterogeneous-data problem

## Proposal
We focus on the process of handling unknown tasks.

The overview is as follows:

![](images/EntGAN%20overview.png)

The process works as follows (an illustrative end-to-end sketch is given after the list):
1. The GAN exploits the unknown-task samples to generate more fake samples.
2. The self-taught learning unit uses the fake samples together with the original unknown-task samples and their labels to train a classifier.
3. A well-trained classifier is output.
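
The following is a minimal, runnable sketch of this three-step flow. It is illustrative only: the GAN is replaced by a Gaussian-jitter stand-in and the self-taught feature learner by PCA so the snippet stays self-contained; all shapes and hyperparameters are arbitrary assumptions, not part of the proposal.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Small labeled set collected for the unknown task (shapes are illustrative).
X = rng.standard_normal((30, 64))
y = rng.integers(0, 2, size=30)

# 1. The GAN generates more fake samples from the unknown-task samples.
#    (Stand-in here: resample the real samples with Gaussian jitter.)
fake = np.repeat(X, 10, axis=0) + 0.1 * rng.standard_normal((300, 64))

# 2. Self-taught learning: learn features from the (unlabeled) fake samples,
#    re-encode the original labeled samples, and train a classifier.
#    (Stand-in feature learner: PCA instead of sparse coding.)
features = PCA(n_components=16, random_state=0).fit(fake)
clf = LogisticRegression(max_iter=1000).fit(features.transform(X), y)

# 3. The well-trained classifier is the output for the unknown task.
print("train accuracy:", clf.score(features.transform(X), y))
```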

Collaborator commented:

What are the targeting scenario and dataset?

### GAN Design
We use the network design from [Towards Faster and Stabilized GAN Training for High-Fidelity Few-Shot Image Synthesis](https://openreview.net/forum?id=1Fqg133qRaI). The design targets small training sets and resource-poor computing devices, so it is well suited for handling unknown tasks in Sedna lifelong learning. The network is shown in the [GAN design](images/EntGAN%20GAN.png) figure below.

![](images/EntGAN%20GAN.png)
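
For concreteness, here is a minimal PyTorch sketch in the spirit of that lightweight design: a small generator with a skip-layer-excitation (SLE) style shortcut and a plain convolutional discriminator. Layer widths, the 64x64 output resolution, and all module names are illustrative assumptions, not the exact network shown in the figure.

```python
import torch
import torch.nn as nn

class SLE(nn.Module):
    """Skip-layer excitation: re-weight high-resolution channels with a
    gate computed from a low-resolution feature map."""
    def __init__(self, low_ch, high_ch):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(4),
            nn.Conv2d(low_ch, high_ch, kernel_size=4),
            nn.LeakyReLU(0.1),
            nn.Conv2d(high_ch, high_ch, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, high, low):
        return high * self.gate(low)

def up(in_ch, out_ch):
    # Nearest-neighbour upsampling followed by a 3x3 convolution.
    return nn.Sequential(
        nn.Upsample(scale_factor=2, mode="nearest"),
        nn.Conv2d(in_ch, out_ch, 3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.LeakyReLU(0.1),
    )

class Generator(nn.Module):
    def __init__(self, z_dim=256):
        super().__init__()
        self.init = nn.Sequential(                        # 1x1 noise -> 4x4 map
            nn.ConvTranspose2d(z_dim, 512, kernel_size=4),
            nn.BatchNorm2d(512),
            nn.LeakyReLU(0.1),
        )
        self.to8, self.to16 = up(512, 256), up(256, 128)  # 4 -> 8 -> 16
        self.to32, self.to64 = up(128, 64), up(64, 32)    # 16 -> 32 -> 64
        self.sle64 = SLE(low_ch=256, high_ch=32)          # shortcut: 8x8 gates 64x64
        self.to_rgb = nn.Conv2d(32, 3, 3, padding=1)

    def forward(self, z):
        f4 = self.init(z.view(z.size(0), -1, 1, 1))
        f8 = self.to8(f4)
        f64 = self.to64(self.to32(self.to16(f8)))
        return torch.tanh(self.to_rgb(self.sle64(f64, f8)))

class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        chs = [3, 32, 64, 128, 256]
        layers = []
        for i in range(len(chs) - 1):                     # 64 -> 4 via stride-2 convs
            layers += [nn.Conv2d(chs[i], chs[i + 1], 4, stride=2, padding=1),
                       nn.LeakyReLU(0.2)]
        layers += [nn.Conv2d(chs[-1], 1, kernel_size=4)]  # 4x4 -> 1x1 logit
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x).view(x.size(0))

# Quick shape check: 64x64 fake images and one logit per image.
g, d = Generator(), Discriminator()
fake = g(torch.randn(8, 256))
print(fake.shape, d(fake).shape)  # torch.Size([8, 3, 64, 64]) torch.Size([8])
```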
@MooreZheng (Collaborator) commented on Oct 18, 2022:

The architecture is needed for the proposal. We see that the GAN is now placed in unseen task processing. It would be better to show the overall architecture so that the user knows which scheme it belongs to (i.e., lifelong learning), not only the unseen task processing component.

See previous comment: #337 (comment)

Collaborator commented:

Not yet resolved


### Self-taught Learning Design
Self-taught learning first uses unlabeled data to learn a latent feature basis (e.g., via sparse coding), then re-encodes every labeled sample as a representation over that basis, and finally trains a classifier on these representations and their corresponding labels.

![](images/EntGAN%20self-taught%20learning.png)
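
As a concrete illustration, here is a minimal scikit-learn sketch of this step, assuming samples are flattened feature vectors. The dictionary size, sparsity penalty, and the logistic-regression classifier are illustrative choices rather than part of the proposal; in our setting the unlabeled pool would be the GAN-generated samples.

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Unlabeled pool (e.g., GAN-generated samples) and a small labeled set,
# flattened to vectors; shapes are illustrative.
X_unlabeled = rng.standard_normal((500, 64))
X_labeled = rng.standard_normal((40, 64))
y_labeled = rng.integers(0, 2, size=40)

# 1. Learn a sparse-coding dictionary (the latent feature basis) from unlabeled data.
dico = DictionaryLearning(n_components=32, alpha=1.0, max_iter=200, random_state=0)
dico.fit(X_unlabeled)

# 2. Re-encode the labeled samples as sparse codes over the learned dictionary.
codes = dico.transform(X_labeled)

# 3. Train a classifier on the higher-level representations and their labels.
clf = LogisticRegression(max_iter=1000).fit(codes, y_labeled)
print("train accuracy:", clf.score(codes, y_labeled))
```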
Binary file added docs/proposals/images/EntGAN GAN.png
Binary file added docs/proposals/images/EntGAN overview.png
Binary file added docs/proposals/images/entgan_system.png