Wednesday, October 31, 2012

Delegating Tasks with App Engine Task Queues

Content of the presentation for the Google Dev Fest in Barcelona (November 9, 2012)

See other talks

Title:
Leaving Tasks to Google App Engine: An Introduction to Task Queues

Presentation: https://www.box.com/files/0/item/f_3909761486

Presentation Contents:

Introduction
  • General concept: a tool for "background" processes
    • Created because of the 30-second limit per request, which tasks extend to 10 minutes
  • API overview: the two types of queues, push vs. pull
  • Use cases
    • Data migration
    • Batch processes
    • Advanced synchronization
    • Integration with cron jobs
Usage / Development / How to Split Tasks
  • General concepts for divide and conquer
  • Google's framework and how it parameterizes tasks
  • How to call the "workers" and special considerations
    • idempotency: what it is
    • considerations for the "cloud" environment
Configuration and Control
  • Tuning techniques
  • Reviewing the attributes of a queue
    • frequency/rate
    • bucket size
    • etc.
  • Your friend the "control panel"
  • Versioning
Best Practices
  • How to design for stopping tasks en masse
  • Task control
Other Frameworks and Tools
  • Map Reduce
  • appengine-pipeline
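
One way to picture the rate and bucket size attributes from the outline is as a token bucket. The following is a minimal sketch in plain Python, not the App Engine API: the class `TokenBucket`, its method `try_run`, and the numbers used are all hypothetical, and the clock is passed in explicitly so the behavior is easy to follow.

```python
class TokenBucket:
    """Toy model of a queue throttled by a rate and a bucket size."""

    def __init__(self, rate_per_second, bucket_size):
        self.rate = float(rate_per_second)      # tokens added per second
        self.capacity = float(bucket_size)      # maximum tokens held
        self.tokens = float(bucket_size)        # start with a full bucket
        self.last = 0.0                         # time of the last refill

    def try_run(self, now):
        """Return True if a task may start at time `now` (in seconds)."""
        elapsed = now - self.last
        self.last = now
        # Refill according to elapsed time, never beyond the bucket size.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# Hypothetical queue: 5 tasks per second, bursts of at most 2.
bucket = TokenBucket(rate_per_second=5, bucket_size=2)
burst = [bucket.try_run(now=0.0) for _ in range(3)]
later = bucket.try_run(now=1.0)
```

In this toy model the bucket size caps how many tasks can start back to back, while the rate decides how quickly capacity comes back after a burst.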

Saturday, August 18, 2012

Analyzing Web Performance: love my RUM

The other day, while watching my favorite TV channel (Google Developers on YouTube), I ran into this presentation about web performance, and it was news to me. Previously I had no idea that modern browsers could measure metrics like "dns lookup", "server connection time", and "server response time". With the awesome work from the W3C Performance Working Group, this is now a reality.

On top of the latest trends in continuous builds and automated testing, you can now get your RUM (Real User Metrics) as part of Google Analytics. You still need to do your own performance testing, but you can also see how the site evolves and even monitor in real time what is going on. So I went and checked Google Analytics and, indeed, they are providing all this data.

I remember a long while ago when a client had some problems with performance and we were working in the dark until we found out it was the DNS lookup time! Imagine how much easier it would have been had we used this data.

For the curious, I am posting the image of the lifecycle of a request. All this data can be obtained from the developer tools of most modern browsers. Safari is still lagging.
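
To make the metrics concrete, here is a small sketch of how they could be derived from the timestamps defined by the W3C Navigation Timing specification (`domainLookupStart`, `connectStart`, `requestStart`, and so on). The function name `timing_metrics`, the metric labels, and the sample values are made up for illustration; in practice the timestamps would be exported from the browser.

```python
def timing_metrics(t):
    """Derive durations in milliseconds from Navigation Timing timestamps."""
    return {
        "dns_lookup": t["domainLookupEnd"] - t["domainLookupStart"],
        "server_connection": t["connectEnd"] - t["connectStart"],
        "server_response": t["responseStart"] - t["requestStart"],
        "page_download": t["responseEnd"] - t["responseStart"],
    }


# Invented sample values, in milliseconds.
sample = {
    "domainLookupStart": 1000,
    "domainLookupEnd": 1120,
    "connectStart": 1120,
    "connectEnd": 1180,
    "requestStart": 1181,
    "responseStart": 1400,
    "responseEnd": 1450,
}

metrics = timing_metrics(sample)
```

With numbers like these, a slow DNS lookup stands out immediately next to the connection and response times.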


Sunday, July 22, 2012

Google App Engine Research Award

I am happy to report that the project I am collaborating on with Dr. Enrique Vivoni and Giuseppe Mascaro was selected as a finalist for the Google App Engine Research Award.

Objective
The project proposes new ways to visualize and access global climate data sets. Our first step is to provide access to land surface temperature.

Background Information
Progress has been made through ground and space-based observing networks and the development of sophisticated numerical models. A class of these models, known as Land Surface Models or LSMs, simulates terrestrial water and energy processes and their two-way interaction with the Earth's atmosphere.

We plan to base our analysis on the Global Land Data Assimilation System (GLDAS), which was jointly developed by NASA and NOAA and uses a series of four LSMs to provide global, 3-hourly fields of a wide range of geophysical variables at 0.25 and 1 degree resolution from 1979 to the present.

Downloading, visualizing, and interpreting the data requires advanced technical skills typically possessed by scientists in geophysical disciplines. We try to make it easy for non-specialist users.

Execution
Our initial step is to focus only on land surface temperature (LST) as the target variable, at 3-hourly resolution for the period 1980 to 2010. We expect the query tools to support a range of inquiries into the land surface temperature distribution:
  1. At a specific location
  2. Across different time periods
  3. For memorable events (for example, Hurricane Katrina)
Stay tuned for a future post on how we are going to develop a set of data visualization tools.

Friday, July 6, 2012

How-to: Preparing Presentations for GTUG Barcelona

If you are going to present at the GTUG, it is worth reviewing the following points, which will help you prepare for the presentation.

Q&A: Test Run!
Just as in software, it is worth running an "acceptance test", or perhaps a unit test, on the content of your presentation. In my experience I always tend to mix in more topics than I should, so keeping it simple is sometimes clearer for the audience. If someone can listen to you, or if you can go over the table of contents with a colleague, it will give you a good perspective on its relevance.

800 x 600 Resolution
Remember that when you present, the screen resolution is lower than the one we usually have on the desktop. It is worth practicing with Eclipse at a resolution of 800x600 to get an idea of how to arrange the elements on screen.

Dual Monitor
For those of us who use two monitors at the office and run Windows, remember that by default Microsoft PowerPoint shows the slide show on the primary monitor. As a result, it does not appear on the screen when you project. Before going to the presentation, change the configuration and practice so that when you run the presentation in PowerPoint, it is shown on the secondary monitor.

Displaying Source Code
Try to switch screens slowly so as not to make the audience dizzy. Although we normally switch screens with ease, when projecting it is better to do it more slowly and as little as possible. That way we keep the focus.

Based on the experience of experts (such as the Google I/O crowd), you should limit the code shown in Eclipse as much as possible and lean toward using slides to show only code excerpts. If you are thinking of showing an entire class, that is a bad sign. You should be much more concrete and focus on particular functions and blocks.

Interactivity
Do not forget to ask the audience questions. You can ask for suggestions, lessons learned, past experiences, or simply whether they have something to share, good or bad.

Audience
Remember to keep your focus on the audience and not on the local monitor. Although it is difficult, always try to look at the audience to see if they are still there :)

Source Code
The code you present needs to be tested and needs to compile. If there is an error, flag it as a "known issue". It is good practice to post the code somewhere (see Google Code or GitHub).

Get Social
Remember to have your Twitter account open so you can receive questions and make new friends.

Update
Remember that the GTUG audience is one of the most motivated by new technologies, so if you are going to present a product, know its latest versions, etc.

Any other points you can think of are welcome.

Saturday, May 19, 2012

Presentation at Barcelona GTUG

Here are the points I suggest presenting at the next Barcelona GTUG meeting. Any suggestion is welcome, so if you have one, just send me an email or comment on the blog.

Title: Implementing RESTful Services on GAE
When: Wednesday, June 20, 2012
Time: 18:15
Place:
FIB - UPC Campus Nord - Sala de actos B6
Calle Jordi Girona, 31, 08034 Barcelona, Spain

Topics

  • Review of setting up a Java project on GAE
  • Using persistence with JDO
  • Managing instances and cron services
  • Configuring the Maven plugin for GAE and Java
  • Restlet library for Android and GAE
  • Implementation examples from a real website
  • Discussing interaction/synchronization patterns with RESTful services
  • Error handling in the communication layer
  • Using threads for better responsiveness
  • Implementing RESTful on Android, taking the activity life cycle into account

Controlling your Task Queues

Summary
One of the key features of any framework for executing tasks (serializable and persistent) is how you can control them when things go wrong. This post discusses part of the control panel for GAE task queues. Details of the process are described in another article.

Previous Work
In my previous entry I wrote about my plans to create automated tasks to process 130,000 transactions. Today I write about how things did not go as expected.

Executing the Tasks
Everything went out of control very quickly. There were some important limits that I needed to respect in the Google Maps API. The most important was the limit of 120,000 transactions a day. With that I had no trouble. However, there must be a limit on transactions per minute that is not published anywhere. With the limits on GAE tasks I had no trouble.

However, after processing around 800 transactions, the Google Maps API started giving me errors. When I looked at the log, I realized I had used up the quota for the day. I had initially expected the quota to be 2000 transactions a day. But when I started executing the tasks, I had set the limit to 5 transactions a second. I quickly reached 300 transactions a minute, thus exhausting my quota.
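
The arithmetic behind those numbers can be checked with a short sketch. The helper names are hypothetical, and the calculation only covers daily quotas; it says nothing about the unpublished per-minute limit mentioned above.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def seconds_until_quota_exhausted(daily_quota, rate_per_second):
    """How long a constant rate takes to use up a daily quota."""
    return float(daily_quota) / rate_per_second


def max_sustained_rate(daily_quota):
    """Highest constant rate (per second) that lasts the whole day."""
    return float(daily_quota) / SECONDS_PER_DAY


# 5 transactions a second is 300 transactions a minute.
per_minute = 5 * 60

# A 2000-a-day quota at 5 per second is gone in under seven minutes.
exhausted_after = seconds_until_quota_exhausted(2000, 5)

# Spreading 120,000 transactions over a day allows about 1.4 per second.
safe_rate = max_sustained_rate(120000)
```

Running the numbers before setting the queue rate shows how far 5 transactions a second is from a rate that a daily quota can sustain.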

Since I had promised my wife that I would have all the data processed by Thursday, I started fearing that I would have to sleep on the couch. The previous day I was bragging about all the computing power I had available, and today I had reached the maximum.

Controlling the Tasks
The console for tasks in GAE is your best friend in these circumstances. Also make sure that you follow the best practices for load handling during the different loads you generate.

The tasks api actually worked as expected but the Google Maps API was the first thing to break down. I had anticipated a processing of 200

Monday, May 14, 2012

Integration as a Service in the Cloud


It is no surprise that the new wave of system integration tools is being designed with the cloud in mind. What really surprises me is how easily they make a very heterogeneous environment seem completely simple to the developers who use it. This article discusses the latest features of Apache SMX Fuse (http://fusesource.com/), a systems integration platform for the cloud.

For those who haven’t tried Apache SMX Fuse, it is an excellent integration solution. It is based on several Apache projects, among them CXF, HttpClient, Camel, and ZooKeeper, and allows the integration of almost any type of system: via FTP, file, web service (SOAP and RESTful), email, plain text file, etc. We have used it for several projects and it is a very powerful tool.

Going back to IaaS for the cloud, one of the most difficult things in configuring highly reliable systems (clusters, pools, master/slaves, you name it) is configuration management (CM); I am following here the SE Book from T-Systems. Some solutions like Oracle WebLogic are somewhat better than others, but in the end, when you are creating containers, app modules, shared libraries, etc., it is very difficult to be error free. That’s why I was glad to see that the folks at SMX Fuse have applied some very good patterns:

Inheritance
One of the pillars of object-oriented design, inheritance allows you to configure modules very efficiently. Based on the Apache project ZooKeeper (http://zookeeper.apache.org/), it helps you to be consistent and achieve excellent repeatability.

Provide Different Sizes: Small, Medium, Large
Why order a large pizza when you only need a slice? That’s definitely a problem when dealing with these highly replicated systems: they are usually really heavy and complex. Apache SMX Fuse now comes in different sizes that allow you to use just what you need.

Bet on Auto-Discovery
Yeah, we all have problems with the famous JNDI registry, but it is a start. I have seen some nice features of this integration platform that are really worth a look: an excellent failover system that allows the delivery of messages through different routes (JMS, web services, etc.) without even worrying about whether the system is on-premises or in a public cloud.

The bottom line of CM for the cloud is to have zero downtime for the developers, reduce the proliferation of dirty systems (VMware images everywhere), and ease the rollout.

Monday, April 2, 2012

Cloud Enhancement for your Smartphone

A couple of weeks ago I decided to develop an app intended to replace our grocery list at home (we normally keep all items on a sheet of paper). The first alternative was to build the web app in HTML5, since everybody is talking about the power of the new JavaScript libraries, so I couldn’t wait to implement it.

For the HTML5 version I developed a website (soon to be revealed to the general public) that allows users to keep grocery items under control. The website turned out to be more advanced than I was expecting. It has plenty of business logic. Items labeled weekly had to be rescheduled. Other items were “Reminder Only”, to be scheduled some day in the future. And some other items required a description in case there was a sale. Yes, my wife and I went crazy with the design. Therefore, I kept all the rules in the business layer and worked on the presentation layer, leveraging HTML5 to reduce the development time. (I used GAE for the cloud.)

The first HTML5 release was a total failure. Although I did my testing, I did not realize how bad the 3G network coverage inside the shopping center is. Each time I marked an item as “checked out” it took forever for the connection to respond. I was always lagging a couple of aisles behind my wife, trying to keep up with what had already been bought and what was pending. No wonder I marked that we had bought soy milk when in fact we had not! (We just bought milk.)

So it was time to go RESTful and do some more cloud enhancements. In the redesign, most of the data is kept locally instead of 100% online. When the Android app starts, it downloads the lists and the items for the specific user and saves them in the local database. The user then edits the items locally and syncs them at will. This design proved to be better than the HTML5 version. It was faster and more responsive.
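
The download, edit locally, and sync-at-will flow can be sketched in a few lines. This is only an illustration in Python rather than the actual Android code: the class `LocalGroceryList` and its methods are hypothetical, and a plain dictionary stands in for the RESTful service on GAE.

```python
class LocalGroceryList:
    """Keeps a local copy of the items and tracks unsynced edits."""

    def __init__(self):
        self.items = {}      # item id -> {"name": ..., "checked": ...}
        self.dirty = set()   # ids edited locally since the last sync

    def download(self, server):
        """Copy every item from the server into the local store."""
        self.items = {item_id: dict(item) for item_id, item in server.items()}
        self.dirty.clear()

    def check_out(self, item_id):
        """Mark an item as bought without touching the network."""
        self.items[item_id]["checked"] = True
        self.dirty.add(item_id)

    def sync(self, server):
        """Push only the locally edited items; return how many were sent."""
        for item_id in sorted(self.dirty):
            server[item_id] = dict(self.items[item_id])
        pushed = len(self.dirty)
        self.dirty.clear()
        return pushed


# The dictionary below stands in for the remote service.
server = {
    1: {"name": "milk", "checked": False},
    2: {"name": "soy milk", "checked": False},
}
phone = LocalGroceryList()
phone.download(server)
phone.check_out(1)                      # instant, even with bad coverage
server_before_sync = server[1]["checked"]
pushed = phone.sync(server)             # done later, when the user chooses
```

Because edits only touch the local copy, marking an item never waits on the network; the cost is that the server is out of date until the next sync.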

Cloud Experience
It was pretty easy to set up the RESTful services in GAE. Although GAE does not provide a framework for those services, the open source community is pretty active in this area. I used Restlet, a framework with a pretty light library, an active user community, fresh documentation, and an easy-to-use API. And because my friends always complain that I use Python, this time I used Java for GAE. The Restlet Framework has a customized release for Android apps, too. I also noticed that DataNucleus, the persistence framework behind Java on GAE, also has a RESTful interface, but I did not use it.

There are some factors to consider when exposing your RESTful services using the cloud. You have to think about the warm-up period. Due to the nature of the cloud, most providers can put your instance to sleep if there is no activity. When the instance is brought back to life, your end user loses some precious seconds loading content. So I activated the warm-up functions in GAE and also updated some of the cron definitions to activate my persistence manager. After that the response time went down from 10 seconds to 600 ms. A huge improvement.

Other Frameworks
I was amazed at how many frameworks there are for Android and iOS cloud offerings. There are companies that give you an SDK so that you can basically store all your data in their cloud for a fee. They take care of caching, serialization, security, geolocation, and performance. Funny, because those are the same services that you already get from GAE or Azure. Some of the companies that I checked were Parse, Cocoafish, Urban Airship, and Kii Cloud, to name a few.

Conclusions
There is definitely an advantage in going cloud for smartphone apps. One of the key factors for cloud storage is that most datastores are NoSQL. Therefore, your schemas can be modified without the pain of foreign keys or other restrictions, which is great for the release cycle of smartphone apps, which is usually every two weeks. So it is a very good solution.

Thursday, January 26, 2012