Monday, March 11, 2019

Garbage collection



Terms:
  • Young generation (Minor GC). 
    • Eden: 
      • New objects get allocated here. 
    • From survivor
      • Once Eden is full, a minor GC occurs and moves surviving objects from Eden and the From survivor space to the To survivor space. 
    • To survivor
  • Old generation (Tenured) (Major GC) 
    • Objects that survive enough minor GCs are promoted to the old generation.
  • Perm generation 
Note: 
  • SerialGen operates on the young gen. 
  • Concurrent gen operates on the old gen. 
-XX:+PrintCommandLineFlags -version TestSystemGC
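To see which collectors a running JVM actually uses for the young and old generations, the standard management beans can be queried. This is a minimal sketch; the class name GcInfo is made up for illustration, and the collector and pool names printed depend on the JVM version and the GC flag in effect.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of the JDK.
class GcInfo {
    // Returns one line per collector registered in the running JVM.
    static List<String> describeCollectors() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName()
                    + " pools=" + String.join(",", gc.getMemoryPoolNames())
                    + " count=" + gc.getCollectionCount()
                    + " timeMs=" + gc.getCollectionTime());
        }
        return lines;
    }

    public static void main(String[] args) {
        describeCollectors().forEach(System.out::println);
    }
}
```

Running it with different flags (for example -XX:+UseSerialGC and then -XX:+UseG1GC) should show a different pair of collectors each time.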

Concepts: 
Minor GC:

Garbage collection algorithms,
  • Minor GC: 
    • Serial copy collector: -XX:+UseSerialGC
      • When to use: 
    • Parallel scavenge collector: -XX:+UseParallelGC
    • Parallel copy collector: -XX:+UseParNewGC
    • Garbage first collector: -XX:+UseG1GC
  • Major GC:
    • Mark-sweep-compact collector: -XX:+UseSerialGC
    • Parallel scavenge mark-sweep collector: -XX:+UseParallelOldGC
    • Concurrent mark-sweep (CMS) collector: -XX:+UseConcMarkSweepGC
    • Garbage-first (G1) collector: -XX:+UseG1GC

Full GC:
  • When a full GC occurs, its pause time is normally longer than that of a minor GC. 

What you need to know
  • Frequency of minor GC 
    • Object allocation rate
    • Size of the eden space. 
  • Frequency of object promotion to old generation
    • Frequency of minor GC (how quickly objects age)
    • Size of survivor space. 
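As a rough worked example of the first point: if Eden fills at the allocation rate, the time between minor GCs is approximately Eden size divided by allocation rate. The sketch below only does that arithmetic; the class name and the sample numbers (512 MB Eden, 128 MB/s) are assumptions for illustration, not measurements.

```java
// Hypothetical helper for back-of-the-envelope GC frequency estimates.
class MinorGcEstimate {
    // Approximate seconds between minor GCs: Eden capacity / allocation rate.
    static double secondsBetweenMinorGcs(long edenBytes, long allocationBytesPerSecond) {
        if (edenBytes <= 0 || allocationBytesPerSecond <= 0) {
            throw new IllegalArgumentException("sizes and rates must be positive");
        }
        return (double) edenBytes / allocationBytesPerSecond;
    }

    public static void main(String[] args) {
        long eden = 512L << 20;       // 512 MB Eden (assumed)
        long allocRate = 128L << 20;  // 128 MB/s allocation rate (assumed)
        System.out.println(secondsBetweenMinorGcs(eden, allocRate) + " s between minor GCs");
        // Halving Eden at the same allocation rate halves the interval.
        System.out.println(secondsBetweenMinorGcs(eden / 2, allocRate) + " s with half the Eden");
    }
}
```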
Steps: 
  • Check the frequency of Minor GC: (Defined by allocation rate and size of Eden)
    • Higher allocation rate and smaller eden space: More frequent minor GC
    • Lower allocation rate or larger eden space: less frequent minor GC. 
  • Check the frequency of Full GC: (Defined by promotion rate and size of old generation space)
    • For Parallel GC 
      • Higher promotion rate and small old generation space: frequent full GC
      • Lower promotion rate and larger old gen: less frequent full GC.
    • For CMS & G1
  • G1: 
    • Avoids fragmentation. 
  • CMS: 

Notes:
  • The longer an object lives, the greater its impact on throughput, latency, and footprint.
  • Object retention can degrade performance more than object allocation. 
  • GC visits only live objects. GCs love small immutable objects. 
  • Object allocation is very cheap (about 10 CPU instructions). 


Takeaways:
  • It's better to use short-lived immutable objects than long-lived mutable objects. 
  • Start with -XX:+UseParallelOldGC and avoid full GC. 
  • If there are frequent full GCs, move to CMS or G1 if needed (for old gen collections). 
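A minimal sketch of the first takeaway: a small immutable value type whose operations return new short-lived instances instead of mutating a long-lived one. The Money class and its fields are hypothetical examples.

```java
// Hypothetical immutable value type; all fields are final and there are no setters.
final class Money {
    private final String currency;
    private final long cents;

    Money(String currency, long cents) {
        this.currency = currency;
        this.cents = cents;
    }

    // Returns a new instance; the operands are left untouched and can die young.
    Money plus(Money other) {
        if (!currency.equals(other.currency)) {
            throw new IllegalArgumentException("currency mismatch");
        }
        return new Money(currency, cents + other.cents);
    }

    long cents() {
        return cents;
    }

    public static void main(String[] args) {
        Money total = new Money("USD", 150).plus(new Money("USD", 250));
        System.out.println(total.cents());
    }
}
```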

Best practices: 
  • Avoid data structure resizing. 
  • Avoid large allocation. 
  • Don't use finalizers (they require extra GC cycles, and GC cycles are slower); if cleanup is required, use reference objects as an alternative. 
  • Avoid soft references. 
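One way to avoid data structure resizing is to size collections up front when the element count is known. The sketch below assumes the default HashMap load factor of 0.75; the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical example of presizing collections to avoid growth copies.
class Presized {
    static List<Integer> squares(int n) {
        // Capacity is known up front, so the backing array never has to grow.
        List<Integer> out = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            out.add(i * i);
        }
        return out;
    }

    static Map<Integer, Integer> squareIndex(int n) {
        // A HashMap resizes once its size exceeds capacity * load factor (0.75 by default),
        // so request enough capacity for n entries.
        Map<Integer, Integer> out = new HashMap<>((int) (n / 0.75f) + 1);
        for (int i = 0; i < n; i++) {
            out.put(i, i * i);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(squares(5));
        System.out.println(squareIndex(5));
    }
}
```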

12-factor apps


#1. One codebase, 1 application
  • Code base to be backed by a version control system such as Git, Subversion etc., 
  • 1 code base = 1 app. 
  • Multiple apps sharing the same code is a violation of the 12-factor. 


#2. Dependencies

Manage dependencies in your application manifest. 
  • Database, image processing libraries. 
Characteristics:
  • Dependencies are managed in the app manifest of a build tool such as Maven or Gradle. 

What should you do or don’t do: 
  • Don’t rely on software pre-installed in each environment; doing so prevents automated deployment. 
  • Don’t assume that the related dependencies will be present in the environment where you deploy. You are responsible for wiring the dependencies. 

#3. Externalize configuration

Application configuration here refers to values such as:
  • Credentials to access a database, or services such as S3. 
  • Environment specific properties

Motivation: 
  • Environment properties stored in the code are a violation, since the code has to be redeployed whenever a property changes. 
  • It's also not a good idea to store environment values (especially credentials) in the code.
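A minimal sketch of reading an environment-specific value from outside the code, with a development fallback. The variable name DB_URL and the fallback value are hypothetical; the lookup takes a Map so it can be exercised without touching the real environment.

```java
import java.util.Map;

// Hypothetical helper for externalized settings.
class EnvConfig {
    // Returns the value for key, or the fallback when it is missing or empty.
    static String setting(Map<String, String> env, String key, String fallback) {
        String value = env.get(key);
        return (value == null || value.isEmpty()) ? fallback : value;
    }

    public static void main(String[] args) {
        // System.getenv() exposes the process environment as a read-only map.
        String dbUrl = setting(System.getenv(), "DB_URL", "jdbc:h2:mem:dev");
        System.out.println("Using database " + dbUrl);
    }
}
```

The same artifact can then run in QA or Prod by changing only the environment, not the code.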
#4. Backing services

Backing services are services that the app talks to. 

Characteristics
  • Services can be attached or detached whenever required. 

Objective:
  • A different SMTP server should be configurable without a code change. 
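The SMTP objective can be sketched by treating the server as a resource located by a URL in configuration. The SmtpEndpoint class, the smtp:// URL form, and the default port 25 are assumptions for illustration; only java.net.URI is a real JDK API here.

```java
import java.net.URI;

// Hypothetical holder for an SMTP backing service location.
final class SmtpEndpoint {
    final String host;
    final int port;

    private SmtpEndpoint(String host, int port) {
        this.host = host;
        this.port = port;
    }

    // Parses a URL such as smtp://mail.example.com:587 taken from configuration.
    static SmtpEndpoint parse(String url) {
        URI uri = URI.create(url);
        if (uri.getHost() == null) {
            throw new IllegalArgumentException("no host in " + url);
        }
        // URI.getPort() returns -1 when the URL has no explicit port.
        int port = uri.getPort() == -1 ? 25 : uri.getPort();
        return new SmtpEndpoint(uri.getHost(), port);
    }

    public static void main(String[] args) {
        SmtpEndpoint smtp = parse("smtp://mail.example.com:587");
        System.out.println(smtp.host + ":" + smtp.port);
    }
}
```

Swapping to another SMTP server then means changing the configured URL, with no change to the code that sends mail.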

#5. Build release run

Summary: 
Build: Application code gets converted into an artifact such as a war file. 
Release: The artifact is combined with the configuration of a specific environment (QA, Dev, Prod).
Run: The release is run in that specific environment. 

Characteristics
  • Strict separation between the build, release and run stages of the application. 
  • Every release must have a release ID such as a timestamp or an incrementing number.
  • 1-click release - using CI/CD tool. 
What should you do: 
  • Use a CI/CD tool, Jenkins, Concourse etc.,
Best Practices:
  • Create smoke tests to ensure the app is running fine after deployment. 
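A timestamp-based release ID can be sketched as follows. The "v" prefix and the yyyyMMddHHmmss pattern in UTC are arbitrary choices for illustration; a CI/CD tool would normally supply its own build number or timestamp.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical release ID generator.
class ReleaseId {
    // Formatting an Instant needs an explicit zone; UTC keeps IDs comparable across machines.
    private static final DateTimeFormatter STAMP =
            DateTimeFormatter.ofPattern("yyyyMMddHHmmss").withZone(ZoneOffset.UTC);

    static String fromTimestamp(Instant builtAt) {
        return "v" + STAMP.format(builtAt);
    }

    public static void main(String[] args) {
        System.out.println(fromTimestamp(Instant.now()));
    }
}
```

Because the digits run from year down to second, sorting such IDs as strings also sorts the releases chronologically.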
#6. Stateless Processes

Characteristics:
  • The process running in the environment should be stateless. 
  • It should not share anything with other processes. 
  • No sticky sessions
What should you do or don’t do:
  • Don’t store any data in a local file system. 
  • Sessions: 
    • Store sessions, if any, in a distributed session store such as Redis. 
    • No sticky sessions. 
  • Create stateless services. 
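The idea can be sketched with a handler that keeps no state of its own and delegates session data to an external store. Everything here is hypothetical: SessionStore is an illustrative interface, and the in-memory implementation only stands in for a real distributed store such as Redis.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical abstraction over an external session store.
interface SessionStore {
    int incrementVisits(String sessionId);
}

// Stand-in for a distributed store; a real deployment would call Redis or similar.
class InMemorySessionStore implements SessionStore {
    private final Map<String, Integer> visits = new ConcurrentHashMap<>();

    @Override
    public int incrementVisits(String sessionId) {
        return visits.merge(sessionId, 1, Integer::sum);
    }
}

// The handler holds no session data, so any instance can serve any request.
class VisitHandler {
    private final SessionStore store;

    VisitHandler(SessionStore store) {
        this.store = store;
    }

    int handle(String sessionId) {
        return store.incrementVisits(sessionId);
    }
}

class StatelessDemo {
    public static void main(String[] args) {
        SessionStore shared = new InMemorySessionStore();
        VisitHandler instanceA = new VisitHandler(shared);
        VisitHandler instanceB = new VisitHandler(shared);
        System.out.println(instanceA.handle("s1")); // first visit
        System.out.println(instanceB.handle("s1")); // second visit, served by another instance
    }
}
```

Since neither instance owns the session, a load balancer can route each request to any instance, which is why sticky sessions are not needed.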

#7. Port binding

#8. Concurrency
#9. Disposability


Your application should keep working even if one or more app instances die. 

#10. Dev/Prod parity

Keep the developer environment as close as possible to the production environment. 


#11. Logs

Ensure it is easy to view and debug logs even when multiple instances of a service exist. 


What should you do or don’t do:
  • Don't use System.out.println.
  • Write to log files on the web server using a framework such as log4j. 
  • Stream logs to a centralized logging facility on an external server, where they can be viewed easily. 
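A minimal sketch of logging through a framework instead of System.out.println. To stay self-contained it uses the JDK's built-in java.util.logging rather than log4j; the OrderService class and its message are hypothetical.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical service that logs through a named logger.
class OrderService {
    private static final Logger LOG = Logger.getLogger(OrderService.class.getName());

    static int placeOrder(String orderId, int quantity) {
        // Parameterized message: the text is only formatted if the record is actually published.
        LOG.log(Level.INFO, "placing order {0} quantity={1}", new Object[] {orderId, quantity});
        return quantity;
    }

    public static void main(String[] args) {
        placeOrder("A-100", 3);
    }
}
```

Where the records end up (console, file, or a central log server) is then decided by the handlers configured for the logger, not by the calling code.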
#12. Admin processes

Characteristics: 
  • Execute migration scripts, if any, on deployment or at startup. 
What should you do or don’t do: 
  • Store migration scripts in the repository. 

Externalizing configurations.




Motivation why to externalize configuration:
  • Externalize config data from the applications across all environments. 
  • Store crucial information such as password for DB in a centralized place. 
  • Share data between different services. 
  • See who changed which property and when and why. 
  • Build once deploy anywhere. 

What is Cloud Config server: 

It's a way to externalize property values by moving them out of the project. It is primarily meant for properties that you normally override, and for keeping secret values out of the project source. 
What should you do to use Spring cloud config
  1. Application configuration: 
    • Manage app configuration in a Git repository (e.g., on GitHub) or a local file system. 
  2. Config server
    • Point to the location of the Git repository or local file system where the app config is located (i.e., the Git URI).
    • @EnableConfigServer
  3. Config client
    • Read the values the same way you would read them from a properties file: 
      • @Value or 

Features
  • Many storage options including Git, Subversion. 
  • Pull model
  • Traceability
  • @RefreshScope allows properties to refresh. It calls the constructor of the bean and reinitializes it. 
  • @ConfigurationProperties is an alternative to @RefreshScope; it reinitializes the bean that is annotated with @ConfigurationProperties. 
  • Encrypt and decrypt property values. 

Use cases for Cloud Config: 
  • Change log level in the properties. 
  • Environment config - containing passwords. 
  • Toggle a feature on or off. 
  • Using @RefreshScope
  • @ConfigurationProperties. 

Order in which Cloud Config looks up property sources: 
  • Config server. 
  • Command line parameters
  • System properties.
  • Classpath: application.yml file
  • Classpath: bootstrap.yml
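The lookup order above can be modelled as a chain of sources in which the first source that defines a key wins. This is a plain-Java sketch of that idea, assuming the list is ordered from highest to lowest priority; LayeredProperties is a hypothetical class and is not how Spring implements its Environment.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical model of ordered property sources; earlier sources take precedence.
class LayeredProperties {
    // LinkedHashMap preserves the order in which sources were added.
    private final Map<String, Map<String, String>> sources = new LinkedHashMap<>();

    LayeredProperties add(String name, Map<String, String> values) {
        sources.put(name, values);
        return this;
    }

    // Returns the value from the first source that defines the key.
    Optional<String> get(String key) {
        for (Map<String, String> source : sources.values()) {
            String value = source.get(key);
            if (value != null) {
                return Optional.of(value);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        LayeredProperties props = new LayeredProperties()
                .add("config server", Collections.singletonMap("log.level", "DEBUG"))
                .add("application.yml", Collections.singletonMap("log.level", "INFO"));
        System.out.println(props.get("log.level").orElse("unset"));
    }
}
```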

Things that you can try: 
  • Simple example to demonstrate properties in Git and reading from config server and client 
  • Using Maven profile-specific properties in the Config Server on GitHub. 
  • Managing log levels in the Config server
  • Using @RefreshScope and @ConfigurationProperties. 

Best practices: 
  • Config Server communication with GitHub: the server pulls every time a client requests a properties file. Instead, use a Jenkins job that pushes the properties to the Config Server's local clone. 
  • Single repository for all teams, subdirectories for different environments. 
  • Source code maintenance: a branch per release vs. always using master. 
  • Managing encrypted values for secure data, using JWT with the Config Server.
  • Use Spring Cloud Bus to refresh all the services that listen on RabbitMQ.

Note: 
  • Store only things that need to be externalized; e.g., resource bundles for internationalization should stay at the project level.