Monday, March 17, 2014

Word / Sentence detector using OpenNLP

OpenNLP is a machine-learning toolkit for natural language processing (NLP). This article sets up a simple Maven project and runs a small program that uses OpenNLP.

Add the following dependencies to the Maven pom.xml:
              <!-- Open NLP -->
              <dependency>
                  <groupId>org.apache.opennlp</groupId>
                  <artifactId>opennlp-tools</artifactId>
                  <version>1.5.3</version>
              </dependency>
             
              <dependency>
                  <groupId>commons-io</groupId>
                  <artifactId>commons-io</artifactId>
                  <version>2.4</version>
              </dependency>

Write a Java program to detect sentences.

The high-level steps are:
  • Open an InputStream on the model file en-sent.bin.
  • Create a SentenceModel from the stream and a SentenceDetectorME from the model.
  • Call sentDetect on the detector to get the array of sentences.


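import java.io.FileInputStream;
import java.io.InputStream;

import opennlp.tools.sentdetect.SentenceDetectorME;
import opennlp.tools.sentdetect.SentenceModel;

public class SentenceDetectorClient {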

       public static void main(String[] args) {
              new SentenceDetectorClient().go();

       }

       private void go() {
              try {

                     // Load the English sentence model file (en-sent.bin).
                     InputStream modelIn = new FileInputStream("src/main/resources/models/en-sent.bin");
                     SentenceModel sModel = new SentenceModel(modelIn);
                     modelIn.close(); // the model has been read, so the stream can be closed

                     // Create a sentence detector backed by the loaded model.
                     SentenceDetectorME sentenceDetector = new SentenceDetectorME(sModel);
                    
                     String articleText = "Chris Gayle on Monday sounded out a warning to the rival teams ahead of the World Twenty20 by declaring that he can score a hundred irrespective of the conditions. “I am capable of scoring a century in any condition and on any wicket in the world. I just want to give the team that kind of a start. It will be nice to get another hundred,” Gayle said. “However it also depends on the conditions as well and how the wicket is playing,” he said. Asked about the tremendous pressure on him to perform every time, when he goes out to bat, the Jamaican dasher said it indeed was a challenge to live up to the expectations. “It creates a lot of pressure as expectations are rising. When you actually set a trend, then people expect you to come good at all times. You have fans worldwide who want me to do well. That’s what they pay for and want to see. But it’s not going to happen all the time but when I do get a chance I try to entertain people as much as possible,” he said. “We are here to retain the title and that’s not going to be easy but we are ready for it and we are ready for the challenges. Our first priority is to make it to the last four, it’s a tough group. Everybody is looking to win the tournament.”";            
                     // Split the article text into sentences.
                     String[] sentences = sentenceDetector.sentDetect(articleText);

                     for (int i = 0; i < sentences.length; i++) {
                            String sentence = sentences[i];
                            // Print each sentence with a 1-based number.
                            System.out.println("Sentence : " + (i + 1) + " " + sentence);
                     }
                    
              } catch (Exception e) {
                     System.out.println("Exception : " + e);
                    
              }

       }

}

Output:
Sentence : 1 Chris Gayle on Monday sounded out a warning to the rival teams ahead of the World Twenty20 by declaring that he can score a hundred irrespective of the conditions.
Sentence : 2 “I am capable of scoring a century in any condition and on any wicket in the world.
Sentence : 3 I just want to give the team that kind of a start.
Sentence : 4 It will be nice to get another hundred,” Gayle said.
Sentence : 5 “However it also depends on the conditions as well and how the wicket is playing,” he said.
Sentence : 6 Asked about the tremendous pressure on him to perform every time, when he goes out to bat, the Jamaican dasher said it indeed was a challenge to live up to the expectations.
Sentence : 7 “It creates a lot of pressure as expectations are rising.
Sentence : 8 When you actually set a trend, then people expect you to come good at all times.
Sentence : 9 You have fans worldwide who want me to do well.
Sentence : 10 That’s what they pay for and want to see.
Sentence : 11 But it’s not going to happen all the time but when I do get a chance I try to entertain people as much as possible,” he said.
Sentence : 12 “We are here to retain the title and that’s not going to be easy but we are ready for it and we are ready for the challenges.
Sentence : 13 Our first priority is to make it to the last four, it’s a tough group.
Sentence : 14 Everybody is looking to win the tournament.”
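
For comparison, the JDK ships a rule-based sentence splitter, java.text.BreakIterator, which needs no model file. The sketch below is not OpenNLP and is not part of the program above; it is only a minimal baseline, assuming English text, that can be run on the same article to see where a rule-based splitter and the trained model agree or differ. The class name RuleBasedSentenceSketch and the sample text are made up for illustration.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class RuleBasedSentenceSketch {

    // Splits text into sentences using the JDK's rule-based BreakIterator (no trained model).
    static List<String> split(String text) {
        BreakIterator boundaries = BreakIterator.getSentenceInstance(Locale.ENGLISH);
        boundaries.setText(text);

        List<String> sentences = new ArrayList<>();
        int start = boundaries.first();
        for (int end = boundaries.next(); end != BreakIterator.DONE; start = end, end = boundaries.next()) {
            // Each segment includes trailing whitespace, so trim it before storing.
            String sentence = text.substring(start, end).trim();
            if (!sentence.isEmpty()) {
                sentences.add(sentence);
            }
        }
        return sentences;
    }

    public static void main(String[] args) {
        // Made-up sample text; replace it with articleText to compare against the OpenNLP output.
        List<String> sentences = split("It rained all day. We stayed in. Nobody minded.");
        for (int i = 0; i < sentences.size(); i++) {
            System.out.println("Sentence : " + (i + 1) + " " + sentences.get(i));
        }
    }
}
```

Text with quotations or abbreviations is where the two approaches are most likely to disagree, so the article above makes a reasonable test input.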

-----------------------------------------------------------------------------

Similarly, the following code tokenizes the same article into words:

InputStream modelIn = new FileInputStream(
"src/main/resources/models/en-token.bin");
TokenizerModel tModel = new TokenizerModel(modelIn);

TokenizerME tokenizer = new TokenizerME(tModel);

String articleText = "Chris Gayle on Monday sounded out a warning to the rival teams ahead of the World Twenty20 by declaring that he can score a hundred irrespective of the conditions. “I am capable of scoring a century in any condition and on any wicket in the world. I just want to give the team that kind of a start. It will be nice to get another hundred,” Gayle said. “However it also depends on the conditions as well and how the wicket is playing,” he said. Asked about the tremendous pressure on him to perform every time, when he goes out to bat, the Jamaican dasher said it indeed was a challenge to live up to the expectations. “It creates a lot of pressure as expectations are rising. When you actually set a trend, then people expect you to come good at all times. You have fans worldwide who want me to do well. That’s what they pay for and want to see. But it’s not going to happen all the time but when I do get a chance I try to entertain people as much as possible,” he said. “We are here to retain the title and that’s not going to be easy but we are ready for it and we are ready for the challenges. Our first priority is to make it to the last four, it’s a tough group. Everybody is looking to win the tournament.”";
String[] tokens = tokenizer.tokenize(articleText);

// Join the tokens with "|" so the token boundaries are visible.
String tokenString = "";
for (int i = 0; i < tokens.length; i++) {
tokenString = tokenString + tokens[i] + "|";
}
System.out.println("No. of tokens : " + tokens.length);
System.out.println(tokenString);

Output:
No. of tokens : 267
Chris|Gayle|on|Monday|sounded|out|a|warning|to|the|rival|teams|ahead|of|the|World|Twenty20|by|declaring|that|he|can|score|a|hundred|irrespective|of|the|conditions|.|“|I|am|capable|of|scoring|a|century|in|any|condition|and|on|any|wicket|in|the|world|.|I|just|want|to|give|the|team|that|kind|of|a|start|.|It|will|be|nice|to|get|another|hundred|,|”|Gayle|said|.|“However|it|also|depends|on|the|conditions|as|well|and|how|the|wicket|is|playing|,|”|he|said|.|Asked|about|the|tremendous|pressure|on|him|to|perform|every|time|,|when|he|goes|out|to|bat|,|the|Jamaican|dasher|said|it|indeed|was|a|challenge|to|live|up|to|the|expectations|.|“It|creates|a|lot|of|pressure|as|expectations|are|rising|.|When|you|actually|set|a|trend|,|then|people|expect|you|to|come|good|at|all|times|.|You|have|fans|worldwide|who|want|me|to|do|well|.|That’s|what|they|pay|for|and|want|to|see|.|But|it|’s|not|going|to|happen|all|the|time|but|when|I|do|get|a|chance|I|try|to|entertain|people|as|much|as|possible|,|”|he|said|.|“We|are|here|to|retain|the|title|and|that|’s|not|going|to|be|easy|but|we|are|ready|for|it|and|we|are|ready|for|the|challenges|.|Our|first|priority|is|to|make|it|to|the|last|four|,|it|’s|a|tough|group|.|Everybody|is|looking|to|win|the|tournament|.|”|


Wednesday, March 5, 2014

Spring Annotation based configuration - reading from property file

Spring can be configured using context XML files or using Java annotations.

Annotate a class with @Configuration if you want to configure Spring in Java.

Full configuration code:

@Configuration
@EnableWebMvc
@ComponentScan(basePackages = "com.cv")
@EnableMongoRepositories(basePackages = { "com.cv.framework.mongorepositories",
              "com.cv.app.repository" })
@PropertySource(value = "classpath:app.properties")
public class AppConfig extends WebMvcConfigurerAdapter {
       @Inject
       Environment environment;
       @Bean
       public InternalResourceViewResolver configureInternalResourceViewResolver() {
              InternalResourceViewResolver resolver = new InternalResourceViewResolver();
              resolver.setPrefix("/WEB-INF/views/");
              resolver.setSuffix(".jsp");
              return resolver;
       }
       @SuppressWarnings("deprecation")
       @Bean
       public Mongo mongo() throws Exception {
              return new Mongo(environment.getProperty("ip"));
       }

       @Bean
       public MongoTemplate mongoTemplate() throws Exception {
              return new MongoTemplate(mongo(),
                           environment.getProperty("mdbname"));
       }

}

@Configuration marks the class as a Spring configuration class.
@ComponentScan tells the container to scan for components in the package com.cv and its sub-packages.

@EnableMongoRepositories tells the container which packages to search for Mongo repositories.

@PropertySource names the properties file from which this configuration class reads configurable values for the application.

Environment is a Spring interface through which the values loaded from the properties file can be read.
environment.getProperty("mdbname") returns the value of the property mdbname from the app.properties file.
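
To make the lookup concrete, here is a plain-JDK sketch of what resolving a key from a properties file amounts to. It uses java.util.Properties rather than Spring's Environment, so it is an illustration of the idea and not how Spring implements it. The class name PropertyLookupSketch and the property values are assumptions; the real contents of app.properties are not shown in this post.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class PropertyLookupSketch {

    // Parses properties-file text and returns the value for the key, or null if the key is absent.
    static String lookup(String propertiesText, String key) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        // Hypothetical contents of app.properties (values invented for this example).
        String appProperties = "ip=127.0.0.1\nmdbname=exampledb\n";

        System.out.println("ip      = " + lookup(appProperties, "ip"));
        System.out.println("mdbname = " + lookup(appProperties, "mdbname"));
        System.out.println("missing = " + lookup(appProperties, "missing"));
    }
}
```

Like this sketch, Environment.getProperty returns null when the key cannot be resolved, so a misspelled key in AppConfig only shows up when the value is used.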

Java-based Spring configuration is often easier to work with than XML configuration: mistakes such as misspelled class or method names are caught at compile time, and IDE code completion comes in very handy.

Some developers still prefer to keep part of the configuration in XML so that it is easy to locate.



Spring security using Java configuration

Spring Security can be configured in two ways: with context XML files or with the newer Java-based configuration. This article explains how to set up Spring Security using Java configuration.
Follow the steps below to configure security in a Spring application.

  • Define the Spring Security filter chain.
  • Create a custom user details service.
  • Create the security configuration.


1. Define the Spring Security filter chain.
       public class WebAppInitializer implements WebApplicationInitializer {
              @Override
              public void onStartup(ServletContext servletContext)
                     throws ServletException {

              // ... (other initialization omitted)

              // Register the Spring Security filter chain for all URLs and dispatcher types.
              servletContext.addFilter("springSecurityFilterChain",
                           new DelegatingFilterProxy("springSecurityFilterChain"))
                     .addMappingForUrlPatterns(EnumSet.allOf(DispatcherType.class),
                                         true, "/*");

              // ... (other initialization omitted)
              }
       }
The above approach eliminates the need for a web.xml. The springSecurityFilterChain filter is registered in this Java class instead of in web.xml.


2. Create a custom user details service class.
       @Service
       public class AppUserDetailsService implements UserDetailsService {

              @Autowired
              private AppUserRepository appUserRepository;

              @Override
              @Transactional
              public UserDetails loadUserByUsername(String userId)
                     throws UsernameNotFoundException {
                     AppUser appUser = appUserRepository.findByUserId(userId);
                     if (appUser == null) {
                            // The UserDetailsService contract requires an exception, not null, for unknown users.
                            throw new UsernameNotFoundException("No user found with id " + userId);
                     }
                     // Expose the user's role to the framework as a granted authority.
                     List<GrantedAuthority> grantedAuthorities = new ArrayList<GrantedAuthority>();
                     grantedAuthorities.add(new SimpleGrantedAuthority(appUser
                           .getAppRole().getName()));
                     return new UserValue(
                           appUser.getId(), appUser.getUserId(),
                           appUser.getPassword(), grantedAuthorities,
                           appUser.getFirstName(), appUser.getLastName());
              }
       }
There are different ways to configure security. This approach uses a custom user details service that loads the user and role information from the database and returns it to the framework. The framework then stores the user and role information in the session for further processing.
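
The contract of such a service can be sketched without Spring: look the user up, turn the role into a list of authorities, and fail loudly when the user does not exist (Spring Security documents that loadUserByUsername must never return null). Everything in this sketch is hypothetical: UserLookupSketch, LoadedUser and the in-memory map stand in for AppUserDetailsService, UserValue and AppUserRepository, and IllegalArgumentException stands in for UsernameNotFoundException.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class UserLookupSketch {

    // Hypothetical stand-in for the user object handed back to the security framework.
    static final class LoadedUser {
        final String userId;
        final String password;
        final List<String> authorities;

        LoadedUser(String userId, String password, List<String> authorities) {
            this.userId = userId;
            this.password = password;
            this.authorities = authorities;
        }
    }

    // Hypothetical in-memory "repository": userId -> {password, role name}.
    private final Map<String, String[]> usersById = new HashMap<>();

    void addUser(String userId, String password, String roleName) {
        usersById.put(userId, new String[] { password, roleName });
    }

    // Returns a fully populated user, or throws when the user id is unknown (never returns null).
    LoadedUser loadUserByUsername(String userId) {
        String[] row = usersById.get(userId);
        if (row == null) {
            // Stand-in for UsernameNotFoundException in the real service.
            throw new IllegalArgumentException("No user found with id " + userId);
        }
        return new LoadedUser(userId, row[0], Collections.singletonList(row[1]));
    }

    public static void main(String[] args) {
        UserLookupSketch service = new UserLookupSketch();
        service.addUser("jdoe", "secret", "ROLE_USER"); // invented sample user

        LoadedUser user = service.loadUserByUsername("jdoe");
        System.out.println(user.userId + " has authorities " + user.authorities);
    }
}
```

Note that the service only loads the stored user; comparing the submitted password with the stored one is left to the framework.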

3. Create the security configuration Java class.

@Configuration
@EnableWebSecurity
@EnableGlobalMethodSecurity(prePostEnabled = true)
public class SecurityConfig extends WebSecurityConfigurerAdapter {

       @Autowired
       AppUserDetailsService appUserDetailsService;

       @Autowired
       public void configureGlobal(AuthenticationManagerBuilder auth)
                     throws Exception {
              auth.userDetailsService(appUserDetailsService);
       }

       @Override
       public void configure(WebSecurity builder) throws Exception {
              builder.expressionHandler(webexpressionHandler()).ignoring()
                           .antMatchers("/resources/**");
       }

       @Bean(name = "webexpressionHandler")
       public DefaultWebSecurityExpressionHandler webexpressionHandler() {
              return new DefaultWebSecurityExpressionHandler();
       }

       @Override
       protected void configure(HttpSecurity http) throws Exception {
              http.csrf().disable()
                     .authorizeRequests()
                            .antMatchers("/loginPage").permitAll()
                            .anyRequest().fullyAuthenticated()
                            .and()
                     .formLogin()
                            .loginPage("/loginPage")
                            .loginProcessingUrl("/j_spring_security_check")
                            .usernameParameter("j_username")
                            .passwordParameter("j_password")
                            .failureUrl("/errorPage")
                            .defaultSuccessUrl("/myhome")
                            .permitAll()
                            .and()
                     .logout()
                            .logoutUrl("/j_spring_security_logout")
                            .logoutSuccessUrl("/loginPage")
                            .deleteCookies("JSESSIONID")
                            .invalidateHttpSession(true);
       }
}



The configure method that accepts HttpSecurity defines the login page, the logout URL, the success and failure URLs, and whether to delete cookies and invalidate the session on logout.
  • The configure method that accepts WebSecurity names the resource folders so that the security framework ignores them.
  • All URLs except those under the resources folder are now intercepted by the security framework, and unauthenticated requests are redirected to the login page.
  • Once the user submits a user id and password, the framework calls AppUserDetailsService.loadUserByUsername with the submitted user id. This method loads the user information and returns it to the security framework.
  • The framework then validates the credentials and, if they are valid, displays the home page defined in the security configuration class.
  • When the user logs out, the session is invalidated and control returns to the login page.
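
The request flow described above can be summarised as a small decision function. This is a simplified, hypothetical model written in plain Java, not Spring Security's actual matching logic: a prefix check stands in for the Ant-style pattern /resources/**, and the names AccessDecisionSketch and Decision are invented for this example.

```java
class AccessDecisionSketch {

    // Possible outcomes for an incoming request in this simplified model.
    enum Decision { IGNORED, PERMITTED, REDIRECT_TO_LOGIN, ALLOWED }

    // Decides what happens to a request path, given whether the user is already logged in.
    static Decision decide(String path, boolean authenticated) {
        if (path.startsWith("/resources/")) {
            return Decision.IGNORED;           // static resources bypass the security filters
        }
        if (path.equals("/loginPage")) {
            return Decision.PERMITTED;         // the login page is open to everyone
        }
        // Every other URL requires a logged-in user.
        return authenticated ? Decision.ALLOWED : Decision.REDIRECT_TO_LOGIN;
    }

    public static void main(String[] args) {
        System.out.println(decide("/resources/css/site.css", false)); // IGNORED
        System.out.println(decide("/loginPage", false));              // PERMITTED
        System.out.println(decide("/myhome", false));                 // REDIRECT_TO_LOGIN
        System.out.println(decide("/myhome", true));                  // ALLOWED
    }
}
```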