
    From: cvs at opencores.org<cvs@o...>
    Date: Sat Aug 25 20:01:15 CEST 2007
    Subject: [cvs-checkins] MODIFIED: jop ...
    Date: 00/07/08 25:20:01

    Added: jop/doc/book/outro outro.tex publications.tex
    Log:
    Handbook update


    Revision Changes Path
    1.1 jop/doc/book/outro/outro.tex

    http://www.opencores.org/cvsweb.shtml/jop/doc/book/outro/outro.tex?rev=1.1&content-type=text/x-cvsweb-markup

    Index: outro.tex
    ===================================================================

    In this chapter, we briefly review the project and summarize its
    contributions. Java for real-time systems is a very new and active
    research area. The chapter closes with suggestions for future
    research based on the proposed Java processor.

    \section{Conclusions}

    In the following list, we draw conclusions about the Java processor
    presented in this document, in relation to the problem stated in
    Section~\ref{sec:resques}:
    \begin{enumerate}
    \item
    A time-predictable Java platform has been demonstrated. As shown in
    Sections~\ref{sec:rtpredict} and~\ref{sec:cache}, the architectural
    design decisions and a time-predictable cache provide the basis for
    a time-predictable Java processor. In Section~\ref{sec:wcet}, it was
    shown that all bytecodes have a known WCET and there are no pipeline
    dependencies. JOP's architecture can therefore be modeled
    cycle-accurately for the low-level WCET analysis.
    \item
    The implementation of a RISC-style stack architecture, with a novel
    mapping of Java bytecodes to microcode addresses (see
    Section~\ref{sec:microcode}), and the analysis of the JVM stack
    usage pattern (see Section~\ref{sec:stack}) with the
    resource-efficient two-level stack cache resulted in a small design.
    In fact, JOP is the smallest implementation of the JVM in hardware
    available to date.
    \item
    The usage of JOP in real-world applications, as described in
    Section~\ref{sec:applications}, shows that JOP is a working
    processor and not only a theoretical architecture.
    \item
    Comparing JOP with various embedded Java solutions in
    Section~\ref{sec:performance} showed that the time-predictable
    processor architecture does not need to be slow. JOP's average
    performance is similar to that of non-real-time Java systems.
    \item
    The flexibility of an FPGA allows for a HW/SW co-design approach,
    with the aim of generating application-specific configurations of
    JOP.
    \item
    In Section~\ref{sec:rtprof}, a simple real-time profile for Java was
    defined. This profile solves a number of issues that arise from
    using standard Java for real-time systems. This profile was
    elaborated upon in Section~\ref{sec:usersched} to create a framework
    for a user-defined scheduler in Java, thus enabling the
    implementation of advanced scheduling concepts at the application
    level.
    \end{enumerate}
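
    The first conclusion can be illustrated with a minimal sketch. If every
    bytecode has a known cycle count and consecutive bytecodes have no
    timing dependencies, the execution time of a straight-line bytecode
    sequence is simply the sum of the individual counts. The class name,
    the table, and the cycle value given for getfield are assumptions made
    for this illustration only; they are not JOP's documented timings.

```java
import java.util.List;
import java.util.Map;

// Illustrative sketch only. The timing table is a placeholder and does
// not reproduce the timings of the real processor.
final class BasicBlockWcet {

    // Hypothetical per-bytecode cycle counts. Simple bytecodes are set
    // to one cycle; the value for getfield is an arbitrary placeholder.
    static final Map<String, Integer> CYCLES = Map.of(
            "iload_0", 1,
            "iload_1", 1,
            "iadd", 1,
            "istore_2", 1,
            "getfield", 10);

    // Without timing dependencies between consecutive bytecodes, the
    // cycle count of a straight-line sequence is the plain sum.
    static int cycles(List<String> block) {
        int total = 0;
        for (String bytecode : block) {
            Integer c = CYCLES.get(bytecode);
            if (c == null) {
                throw new IllegalArgumentException("no timing for " + bytecode);
            }
            total += c;
        }
        return total;
    }

    public static void main(String[] args) {
        // c = a + b on local variables: four single-cycle bytecodes.
        System.out.println(cycles(List.of("iload_0", "iload_1", "iadd", "istore_2")));
    }
}
```

    A real low-level analysis would combine such per-block sums with
    control-flow information and the cache model; the sketch covers only
    the additive step.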

    \section{Summary of Contributions}

    The research contributions made by this work are related to two
    areas: real-time Java and resource-constrained embedded systems.

    \subsubsection{A Real-Time Java Processor}

    The goal of time-predictable execution of Java programs was a
    first-class guiding principle throughout the development of JOP:

    \begin{itemize}

    \item
    The execution time for Java bytecodes can be exactly predicted in
    terms of the number of clock cycles.
    % The execution time for Java bytecodes is known cycle-accurate.
    JOP is therefore a straightforward target for low-level WCET
    analysis. There is no mutual dependency between consecutive
    bytecodes that could result in unbounded timing effects.

    \item
    In order to provide time-predictable execution of Java bytecodes,
    the processor pipeline is designed without any prefetching or
    queuing. This design avoids hard-to-analyze and possibly unbounded
    pipeline dependencies. There are no pipeline stalls caused by
    interrupts or the memory subsystem that would complicate the WCET
    analysis.

    \item
    A pipelined processor architecture calls for higher memory
    bandwidth. A standard technique to avoid processing bottlenecks due
    to this higher bandwidth demand is caching.
    %In order to fill the gap between processor speed and the memory
    %access time, caches are mandatory, even in embedded systems.
    However, standard cache organizations improve the average execution
    time but are difficult to predict for WCET analysis. Two
    time-predictable caches are proposed for JOP: a \emph{stack cache}
    as a substitution for the data cache and a \emph{method cache} to
    cache the instructions. As the stack is a heavily accessed memory
    region, the stack -- or part of it -- is placed in local memory.
    This part of the stack is referred to as the \emph{stack cache} and
    described in Section~\ref{sec:stack}. Fill and spill of the stack
    cache are under microcode control and therefore time-predictable.
    In Section~\ref{sec:cache}, a novel way to organize an instruction
    cache, as a \emph{method cache}, is given. The cache stores complete
    methods, and cache misses only occur on method invocation and
    return. Cache block replacement depends on the call tree, instead of
    instruction addresses. This \emph{method cache} is easy to analyze
    with respect to worst-case behavior and still provides a substantial
    performance gain when compared against a solution without an
    instruction cache.

    \item
    The time-predictable processor described above provides the basis
    for real-time Java. The issues with standard Java and the Real-Time
    Specification for Java were analyzed in Chapter~\ref{chap:rtjava}.
    To enable real-time Java to operate on resource-constrained devices,
    a simple real-time profile was defined in Section~\ref{sec:rtprof}
    and implemented in Java on JOP. The beauty of this approach is in
    implementing functions usually associated with an RTOS in Java. This
    means that real-time Java is not based on an RTOS, and therefore not
    restricted to the functionality provided by the RTOS. With JOP, a
    self-contained real-time system in pure Java becomes possible. The
    tight integration of the scheduler and the hardware that generates
    schedule events results in low latency and low jitter of the task
    dispatch.

    \item
    %The timer interrupt in JOP generates interrupts at the release times
    %of the tasks. The scheduler is responsible for reprogramming the
    %timer after each occurrence of a timer interrupt.
    %Controlling the
    %timer interrupt as part of the scheduling results in the low jitter
    %of periodic tasks.
    %
    The defined real-time profile suggests a new way to handle hardware
    interrupts to avoid interference between blocking device drivers and
    application tasks. Hardware interrupts other than the timer
    interrupt are represented as asynchronous events with an associated
    thread. These events are \emph{normal} schedulable objects and
    subject to the control of the scheduler. With a minimum interarrival
    time, these events, and the associated device drivers, can be
    incorporated into the priority assignment and schedulability
    analysis in the same way as normal application tasks.

    \end{itemize}

    The above-described contributions result in a time-predictable
    execution environment for real-time applications written in Java,
    without the resource implications and unpredictability of a JIT
    compiler. The proposed processor architecture is a straightforward
    target for low-level WCET analysis.

    %New applications, such as multimedia streaming, result in
    %\emph{soft} real-time systems that need a more flexible scheduler
    %than the traditional fixed priority-based ones.
    Implementing a real-time scheduler in Java opens up new
    possibilities. The scheduler is extended to provide a framework for
    user-defined scheduling in Java. In Section~\ref{sec:usersched}, we
    analyzed which events are exposed to the scheduler and which
    functions from the JVM need to be available in the user space. A
    simple-to-use framework to evaluate new scheduling concepts is
    given.

    \subsubsection{A Resource-Constrained Processor}

    Embedded systems are usually very resource-constrained. Using a
    low-cost FPGA as the main target technology forced the design to be
    small. The following architectural features address this issue:

    \begin{itemize}

    \item
    The architecture of JOP is best described as:
    \begin{quote}
    The JVM is a CISC stack architecture, whereas JOP is a RISC stack
    architecture.
    \end{quote}
    JOP contains its own instruction set, called microcode in this
    thesis, with a novel way of mapping bytecodes to microcode
    addresses. This mapping has zero overhead, as described in
    Section~\ref{sec:microcode}. Basic bytecode instructions have a
    one-to-one mapping to microcode instructions and therefore execute
    in a single cycle. The stack architecture allows compact encoding of
    microinstructions in 8 bits to save internal memory. This approach
    allows flexible implementation of Java bytecodes in hardware, as a
    microcode sequence, or even in Java itself.

    \item
    The analysis of the JVM stack usage pattern in
    Section~\ref{sec:stack} led to the design of a resource-efficient
    two-level stack cache. This two-level stack cache fits the embedded
    memory technologies of current FPGAs and ASICs and ensures fast
    execution of basic instructions. Part of the stack cache, which is
    implemented in an on-chip memory, is also used for microcode
    variables and constants. This resource sharing reduces not only the
    number of memory blocks needed for the processor, but also the
    number of data paths to and from the execution unit.

    \item
    Interrupts are considered hard to handle in a pipelined processor,
    resulting in a complex (and therefore resource-consuming)
    implementation. In JOP, the above-mentioned bytecode-microcode
    mapping is used to keep interrupt handling out of the core pipeline.
    %
    %An implementation of interrupts at the bytecode-microcode mapping
    %keeps interrupts transparent in the core pipeline and avoids complex
    %logic. Therefore,
    %
    Interrupts generate special bytecodes that are inserted in a
    transparent way in the bytecode stream. Interrupt handlers can be
    implemented in the same way as bytecodes are implemented: in
    microcode or in Java.

    \end{itemize}

    The above design decisions were made to keep the size of the
    processor small without sacrificing performance.
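
    The bytecode-to-microcode mapping and the interrupt handling described
    above can be pictured as a table lookup in the bytecode fetch stage.
    The following is a minimal software sketch of that idea, not the
    hardware implementation: the class name, the microcode addresses, the
    default table entry, and the opcode value chosen for the interrupt
    bytecode are all assumptions made for this illustration. Only the
    opcodes for nop (0x00) and iadd (0x60) are standard JVM values.

```java
import java.util.Arrays;

// Sketch only: table contents, addresses, and the opcode used for the
// interrupt bytecode are hypothetical.
final class BytecodeFetch {
    static final int SYS_INT = 0xF0;            // hypothetical special bytecode
    static final int ADDR_NOP = 0x010;          // hypothetical microcode addresses
    static final int ADDR_IADD = 0x020;
    static final int ADDR_INT = 0x0F0;
    static final int ADDR_JAVA_TRAP = 0x100;    // hypothetical entry for bytecodes
                                                // implemented in Java

    private final int[] jumpTable = new int[256]; // opcode -> microcode address
    private final byte[] code;
    private int pc = 0;
    private boolean interruptPending = false;

    BytecodeFetch(byte[] code) {
        this.code = code.clone();
        Arrays.fill(jumpTable, ADDR_JAVA_TRAP);
        jumpTable[0x00] = ADDR_NOP;     // nop
        jumpTable[0x60] = ADDR_IADD;    // iadd
        jumpTable[SYS_INT] = ADDR_INT;  // interrupt handler entry
    }

    void raiseInterrupt() {
        interruptPending = true;
    }

    // Returns the microcode start address for the next bytecode. A pending
    // interrupt is served by substituting the special bytecode; the program
    // counter is not advanced, so the regular stream resumes afterwards.
    int nextMicrocodeAddress() {
        int opcode;
        if (interruptPending) {
            interruptPending = false;
            opcode = SYS_INT;
        } else {
            opcode = code[pc++] & 0xFF;
        }
        return jumpTable[opcode];
    }
}
```

    In this sketch the stage that consumes the microcode addresses never
    sees an interrupt as such. It only sees one more table entry, which is
    the property the text attributes to the mapping.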
    JOP is the smallest Java processor available to date that provides
    the basis for an implementation of the CLDC specification (see
    Section~\ref{subsec:cldc}). JOP is a fast execution environment for
    Java, without the resource implications and unpredictability of a
    JIT compiler. The average performance of JOP is similar to that of
    mainstream, non-real-time Java systems.

    JOP is a flexible architecture that allows different configurations
    for different application domains. Therefore, size can be traded
    against performance. As an example, resource-intensive instructions,
    such as floating point operations, can be implemented in Java. The
    flexibility of an FPGA implementation also allows adding
    application-specific hardware accelerators to JOP.

    The small size of the processor allows usage of low-cost FPGAs in
    embedded systems that can compete against standard microcontrollers.
    JOP has been implemented in several different FPGA families and is
    used in different real-world applications.

    Programs for embedded and real-time systems are usually
    multi-threaded, and a small design provides a path to a
    multi-processor system in a mid-sized FPGA or in an ASIC. A tiny
    architecture also opens new application fields when implemented in
    an ASIC. Smart sensors and actuators, for example, are very
    sensitive to cost, which is proportional to the die area.

    \section{Future Research Directions}

    JOP provides a basis for various directions for future research.
    Some suggestions are given below:
    %
    \begin{description}

    \item[Real-time garbage collector:]
    In Section~\ref{sec:gc}, a real-time garbage collector was
    presented. Hardware support of a real-time GC would be an
    interesting topic for further research. Another question that
    remains with a real-time GC is the analysis of the worst-case memory
    consumption of tasks (similar to the WCET values), and scheduling
    the GC so that it can keep up with the allocation rate.

    \item[Hardware accelerator:]
    The flexibility of an FPGA implementation of a processor opens up
    new possibilities for hardware accelerators. We have shown in
    Section~\ref{sec:hwsw:co} how the implementation of a bytecode can
    be moved between hardware and software. A further step would be to
    generate an application-specific system in which part of the
    application code is moved to hardware. Ideally, the hardware
    description should be extracted automatically from the Java source.
    Preliminary work in this area, using JOP as its basis, can be found
    in \cite{jop:sac05}.

    \item[Hardware scheduler:]
    In JOP, scheduling and dispatch are done in Java (with some
    microcode support). For tasks with very short periods, the
    scheduling overhead can prove to be too high. A scheduler
    implemented in hardware can shorten this time, due to the parallel
    nature of the algorithm.

    \item[Multiprocessor JVM:]
    In order to generate a small and predictable processor, several
    advanced and resource-consuming features (such as instruction
    folding or branch prediction) were omitted from the design. The
    resulting low resource usage of JOP makes it possible to integrate
    more than one processor in an FPGA. Since embedded applications are
    naturally multi-threaded systems, the performance can easily be
    enhanced using a multi-processor solution. A multi-processor JVM
    with shared memory offers the following research possibilities:
    scheduling of Java threads and synchronization between the
    processors; WCET analysis for the shared memory access. First
    results on a JOP CMP are described in \cite{jop:dma, jop:cmp}.

    \item[Instruction cache:]
    The cache solution proposed in Section~\ref{sec:cache} provides
    predictable instruction cache behavior while, in the average case,
    still performing in a similar way to a direct-mapped cache. However,
    an analysis tool for the worst-case behavior is still needed.
    With this tool, and a more complex analysis tool for traditional
    instruction caches, we also need to verify that the worst-case miss
    penalty is lower than with a traditional instruction cache. A second
    interesting aspect of the proposed method cache is the fact that the
    replacement decision on a cache miss only occurs on method invoke
    and return. The infrequency of this decision means that more time is
    available for more advanced replacement algorithms.

    \item[Real-time Java:]
    Although there is already a definition for real-time Java, i.e.\ the
    RTSJ \cite{rtsj}, this definition is not necessarily adequate. There
    is ongoing research on how memory should be managed for real-time
    Java applications: scoped memory, as suggested by the RTSJ, usage of
    a real-time GC, or application-managed memory through memory pools.
    However, almost no research has been done into how the Java library,
    which has proven a major part of Java's success, can be used in
    real-time systems or how it can be adapted to do so. The question of
    what the best memory management is for the Java standard library
    remains unanswered.

    \item[Java computer:]
    How would a processor architecture and operating system architecture
    look in a `Java only' system? Here, we need to rethink our approach
    to processes, protection, kernel- and user-space, and virtual
    memory. The standard approach of using memory protection between
    different processes is necessary for applications that are
    programmed in languages that use memory addresses as data, i.e.\
    pointer usage and pointer manipulation. In Java, no memory addresses
    are visible and pointer manipulation is not possible. This very
    important feature of Java makes Java a \emph{safe} language.
    Therefore, an error-free JVM means we do not need memory protection
    between processes and we do not need to make a distinction between
    kernel and user space (with all the overhead) in a Java system.
    Another reason for using virtual addresses is link addresses.
    However, in Java this issue does not exist, as all classes are
    linked dynamically and the code itself (i.e.\ the bytecodes) only
    uses relative addressing.

    Another issue here is the paging mechanism in a virtual memory
    system, which has to be redesigned for a Java computer. For this, we
    need to merge the virtual memory management with the GC. It does not
    make sense to have a virtual memory manager that works with plain
    (e.g.\ 4\,KB) memory pages without knowledge about object lifetime.
    We therefore need to incorporate the virtual memory paging with a
    generational GC. The GC knows which objects have not been accessed
    for a long time and can be swapped out to the disk. Handling paging
    as part of the GC process also avoids page fault exceptions and
    thereby simplifies the processor architecture.

    Another question is whether we can substitute the process notion
    with threads, or whether we need several JVMs on a `Java only'
    system. It depends. If we can live with the concept of shared static
    class members, we can substitute heavyweight processes with
    lightweight threads. It is also possible that we would have to
    define some further thread-local data structures in the operating
    system.

    \end{description}
    %
    It is the opinion of the author that Java is a promising language
    for future real-time systems. However, a number of issues remain to
    be solved. JOP, with its time-predictable execution of Java
    bytecodes, is an important but nevertheless only a small part of a
    real-time Java system.



    1.1 jop/doc/book/outro/publications.tex

    http://www.opencores.org/cvsweb.shtml/jop/doc/book/outro/publications.tex?rev=1.1&content-type=text/x-cvsweb-markup

    Index: publications.tex
    ===================================================================
    % This is a copy from /usr/doc
    \begin{itemize}

    \item Martin Schoeberl. Using a {J}ava Optimized Processor in a Real
    World Application.
    In {\em Proceedings of the First Workshop on Intelligent Solutions
    in Embedded Systems (WISES 2003)}, pages 165--176, Vienna, Austria,
    June 2003.

    \item Martin Schoeberl. Design Decisions for a {J}ava Processor. In
    {\em Tagungsband Austrochip 2003}, pages 115--118, Linz, Austria,
    October 2003.

    \item Martin Schoeberl. {JOP}: {A} {J}ava Optimized Processor. In
    R.~Meersman, Z.~Tari, and D.~Schmidt, editors, {\em On the Move to
    Meaningful Internet Systems 2003: Workshop on {J}ava Technologies
    for Real-Time and Embedded Systems (JTRES 2003)}, volume 2889 of
    {\em Lecture Notes in Computer Science}, pages 346--359, Catania,
    Italy, November 2003. Springer.

    \item Martin Schoeberl. Restrictions of {J}ava for Embedded
    Real-Time Systems. In {\em Proceedings of the 7th IEEE International
    Symposium on Object-Oriented Real-Time Distributed Computing (ISORC
    2004)}, pages 93--100, Vienna, Austria, May 2004.

    \item Martin Schoeberl. Design Rationale of a Processor Architecture
    for Predictable Real-Time Execution of {J}ava Programs. In {\em
    Proceedings of the 10th International Conference on Real-Time and
    Embedded Computing Systems and Applications (RTCSA 2004)},
    Gothenburg, Sweden, August 2004.

    \item Martin Schoeberl. Real-Time Scheduling on a {J}ava Processor.
    In {\em Proceedings of the 10th International Conference on
    Real-Time and Embedded Computing Systems and Applications (RTCSA
    2004)}, Gothenburg, Sweden, August 2004.

    \item Martin Schoeberl. {J}ava Technology in an {FPGA}. In {\em
    Proceedings of the International Conference on Field-Programmable
    Logic and its applications (FPL 2004)}, Antwerp, Belgium, August
    2004.

    \item Martin Schoeberl. A Time Predictable Instruction Cache for a
    Java Processor.
    In Robert Meersman, Zahir Tari, and Angelo Corsaro, editors, {\em On
    the Move to Meaningful Internet Systems 2004: Workshop on {J}ava
    Technologies for Real-Time and Embedded Systems (JTRES 2004)},
    volume 3292 of {\em Lecture Notes in Computer Science}, pages
    371--382, Agia Napa, Cyprus, October 2004. Springer.

    \item Flavius Gruian, Per Andersson, Krzysztof Kuchcinski, and
    Martin Schoeberl. Automatic generation of application-specific
    systems based on a micro-programmed java core. In {\em Proceedings
    of the 20th ACM Symposium on Applied Computing, Embedded Systems
    track}, Santa Fe, New Mexico, March 2005.

    \item Martin Schoeberl. Design and implementation of an efficient
    stack machine. In {\em Proceedings of the 12th IEEE Reconfigurable
    Architecture Workshop (RAW2005)}, Denver, Colorado, USA, April 2005.
    IEEE.

    \item Martin Schoeberl. {\em JOP: A Java Optimized Processor for
    Embedded Real-Time Systems}. PhD thesis, Vienna University of
    Technology, 2005.

    \item Martin Schoeberl. Evaluation of a {J}ava processor. In {\em
    Tagungsband Austrochip 2005}, pages 127--134, Vienna, Austria,
    October 2005.

    \item Martin Schoeberl. A time predictable {J}ava processor. In {\em
    Proceedings of the Design, Automation and Test in Europe Conference
    (DATE 2006)}, pages 800--805, Munich, Germany, March 2006.

    \item Martin Schoeberl. Real-time garbage collection for {J}ava. In
    {\em Proceedings of the 9th IEEE International Symposium on Object
    and Component-Oriented Real-Time Distributed Computing (ISORC
    2006)}, pages 424--432, Gyeongju, Korea, April 2006.

    \item Martin Schoeberl. Instruction cache für echtzeitsysteme, April
    2006. Austrian patent AT 500.858.

    \item Rasmus Pedersen and Martin Schoeberl. An embedded support
    vector machine. In {\em Proceedings of the Fourth Workshop on
    Intelligent Solutions in Embedded Systems (WISES 2006)}, pages
    79--89, Jun. 2006.

    \item Rasmus Pedersen and Martin Schoeberl. Exact roots for a
    real-time garbage collector.
    In {\em Proceedings of the Workshop on {J}ava Technologies for
    Real-Time and Embedded Systems (JTRES 2006)}, Paris, France, October
    2006.

    \item Martin Schoeberl and Rasmus Pedersen. {WCET} analysis for a
    {Java} processor. In {\em Proceedings of the Workshop on {J}ava
    Technologies for Real-Time and Embedded Systems (JTRES 2006)},
    Paris, France, October 2006.

    \item Martin Schoeberl, Hans Sondergaard, Bent Thomsen, and
    Anders~P. Ravn. A profile for safety critical java. In {\em 10th
    IEEE International Symposium on Object and Component-Oriented
    Real-Time Distributed Computing (ISORC'07)}, pages 94--101,
    Santorini Island, Greece, May 2007. IEEE Computer Society.

    \item Martin Schoeberl. Mission modes for safety critical java. In
    {\em 5th IFIP Workshop on Software Technologies for Future Embedded
    \& Ubiquitous Systems}, May 2007.

    \item Raimund Kirner and Martin Schoeberl. Modeling the function
    cache for worst-case execution time analysis. In {\em Proceedings of
    the 44th Design Automation Conference, DAC 2007}, San Diego, CA,
    USA, June 2007. ACM.

    \item Martin Schoeberl. A time-triggered network-on-chip. In {\em
    International Conference on Field-Programmable Logic and its
    Applications (FPL 2007)}, Amsterdam, Netherlands, August 2007.

    \item Christof Pitter and Martin Schoeberl. Time predictable {CPU}
    and {DMA} shared memory access. In {\em International Conference on
    Field-Programmable Logic and its Applications (FPL 2007)},
    Amsterdam, Netherlands, August 2007.

    \item Martin Schoeberl. A {Java} processor architecture for embedded
    real-time systems. {\em Journal of Systems Architecture},
    doi:10.1016/j.sysarc.2007.06.001, 2007.

    \item Wolfgang Puffitsch and Martin Schoeberl. {picoJava-II} in an
    {FPGA}. In {\em Proceedings of the 5th international workshop on
    Java technologies for real-time and embedded systems (JTRES 2007)},
    Vienna, Austria, September 2007. ACM Press.

    \item Martin Schoeberl. Architecture for object oriented programming
    languages.
    In {\em Proceedings of the 5th international workshop on Java
    technologies for real-time and embedded systems (JTRES 2007)},
    Vienna, Austria, September 2007. ACM Press.

    \item Christof Pitter and Martin Schoeberl. Towards a {Java}
    multiprocessor. In {\em Proceedings of the 5th international
    workshop on Java technologies for real-time and embedded systems
    (JTRES 2007)}, Vienna, Austria, September 2007. ACM Press.

    \item Martin Schoeberl and Jan Vitek. Garbage collection for safety
    critical {Java}. In {\em Proceedings of the 5th international
    workshop on Java technologies for real-time and embedded systems
    (JTRES 2007)}, Vienna, Austria, September 2007. ACM Press.

    \end{itemize}

     