At the end of the day, the Python interpreter (CPython) is native software, written in C
Most commonly, efficient Python code is written in C or C++ and then wrapped into Python
Simply put, the JVM is native software too, written in C++
Can Python code be written in Java and then wrapped into Python?
Apparently yes, via several ad-hoc bridging technologies:
Why exactly JPype?
Ensure you have a JVM installed on your system
the JAVA_HOME environment variable should be set to the JVM installation directory
Ensure your compiled Java code is available as a .jar file
e.g. /path/to/my.jar
Install the JPype package via pip install JPype1
You first need JPype to start a JVM instance in your Python process
JPype looks up the JVM via the JAVA_HOME environment variable
import jpype
# start the JVM
jpype.startJVM(classpath=["/path/to/my.jar"])
Once the JVM is started, one can import Java classes and call their methods as if they were Python objects
import jpype.imports # this is necessary to import Java classes
from java.lang import System # import the java.lang.System class
System.out.println("Hello World!")
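A JVM can be started only once per Python process, so it may help to guard the startup call. The following is a minimal sketch, assuming a hypothetical helper named ensure_jvm (it is not part of JPype). It relies only on jpype.isJVMStarted and jpype.startJVM, and imports jpype lazily so that the module can be loaded before JPype is needed.

```python
def ensure_jvm(classpath=()):
    """Start the JVM unless it is already running.

    Returns True if the JVM was started by this call, False otherwise.
    ensure_jvm is a hypothetical helper, not a JPype function.
    """
    import jpype  # lazy import: requires the JPype1 package

    if jpype.isJVMStarted():
        return False
    # classpath entries are converted to plain strings (e.g. from pathlib.Path)
    jpype.startJVM(classpath=[str(entry) for entry in classpath])
    return True
```

Calling ensure_jvm(["/path/to/my.jar"]) twice starts the JVM only the first time; the second call is a no-op.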
Overview on the official documentation
Wherever possible, Java classes are presented similarly to Python classes
the only major difference is that Java classes and objects are closed and cannot be modified
Overview on the official documentation
Java exceptions extend from Python exceptions
Java exceptions can be dealt with in the same way as Python native exceptions
i.e. via try-except blocks
JException serves as the base class for all Java exceptions
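As an illustration, a Java exception can be caught and translated into a Python one. This is a minimal sketch, assuming a hypothetical helper named parse_java_int and an already running JVM; it uses only jpype.JClass, jpype.JException, and the standard java.lang.Integer.parseInt method.

```python
def parse_java_int(text):
    """Parse text via java.lang.Integer.parseInt, raising ValueError on failure.

    parse_java_int is a hypothetical helper; the JVM must already be running.
    """
    import jpype  # lazy import: requires the JPype1 package

    Integer = jpype.JClass("java.lang.Integer")
    try:
        return int(Integer.parseInt(text))
    except jpype.JException as exc:  # base class of all Java exceptions
        raise ValueError(f"not a Java int: {text!r}") from exc
```

Catching JException is deliberately broad here; a narrower Java exception class could be caught in the same way.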
Overview on the official documentation
most Python primitives directly map into Java primitives
however, Python does not have the same primitive types…
… hence, explicit casts may be needed in some cases
each primitive Java type is exposed in JPype (jpype.JBoolean, .JByte, .JChar, .JShort, .JInt, .JLong, .JFloat, .JDouble).
Overview on the official documentation
Java strings are similar to Python strings
they are both immutable and produce a new string when altered
most operations can use Java strings in place of Python strings
when comparing or using strings as dictionary keys, all JString objects should be converted to Python
Overview on the official documentation
Java arrays are mapped to Python lists
more precisely, they operate like Python lists, but they are fixed in size
reading a slice from a Java array returns a view of the array, not a copy
passing a slice of a Python list to Java will create a copy of the sub-list
Overview on the official documentation
Java collections are overloaded with Python syntax where possible
Java’s Iterables are mapped to Python iterables by overriding the __iter__ method
Java’s Collections are mapped to Python containers by overriding __len__
Java’s Maps support Python’s dictionaries syntax by overriding __getitem__ and __setitem__
Java’s Lists support Python’s lists syntax by overriding __getitem__ and __setitem__
Overview on the official documentation
Java interfaces can be implemented in Python, via JPype’s decorators
Java’s open / abstract classes cannot be extended in Python
Python lambda expressions can be cast to Java’s functional interfaces
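The decorators in question are jpype.JImplements and jpype.JOverride. The following is a minimal sketch, assuming a hypothetical helper named make_runnable and an already running JVM; it wraps a Python callable into an object implementing java.lang.Runnable.

```python
def make_runnable(callback):
    """Wrap a zero-argument Python callable into a java.lang.Runnable implementation.

    make_runnable is a hypothetical helper; the JVM must already be running
    when it is called, because the interface is resolved by name.
    """
    import jpype  # lazy import: requires the JPype1 package

    @jpype.JImplements("java.lang.Runnable")
    class _PyRunnable:
        @jpype.JOverride  # marks run() as the implementation of Runnable.run()
        def run(self):
            callback()

    return _PyRunnable()
```

The returned object can then be passed wherever Java expects a Runnable, for instance to the constructor of java.lang.Thread.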
Overview on the official documentation
none, there is no way to convert
explicit (E), JPype can convert the desired type, but only explicitly via casting
implicit (I), JPype will convert as needed
exact (X), like implicit, but takes priority in overload selection
Consider the following example of Python code with JPype:
import jpype.imports
from java.lang import System
System.out.println(1)
System.out.println(2.0)
System.out.println('A')
Which overload of System.out.println is called among the many admissible ones?
1 is convertible to Java’s int, long, and short
int is the exact match
2.0 is convertible to Java’s float and double
double is the exact match
'A' is convertible to Java’s String and char
String is the exact match
Consider the following example of Python code with JPype:
import jpype
csv = jpype.JClass("io.github.gciatto.csv.Csv")
csv.headerOf(["field", "another field"])
This would raise the following error:
TypeError: Ambiguous overloads found for io.github.gciatto.csv.Csv.headerOf(list) between:
public static final io.github.gciatto.csv.Header io.github.gciatto.csv.Csv.headerOf(java.lang.Iterable)
public static final io.github.gciatto.csv.Header io.github.gciatto.csv.Csv.headerOf(java.lang.String[])
list is convertible to both Java’s Iterable and String[]
To solve this issue, one can explicitly cast the Python list to the desired Java type:
import jpype
import jpype.imports
from java.lang import Iterable as JIterable
csv = jpype.JClass("io.github.gciatto.csv.Csv")
csv.headerOf(JIterable@["field", "another field"])
# returns Header("field", "another field")
One may customise the behaviour of Java types in Python by providing custom implementations for them
via the @JImplementationFor decorator
In that case, the special method __jclass_init__ (if defined by the custom implementation) is called just once, to configure the class
In type hierarchies, implementations provided for superclasses are inherited by subclasses
Consider, for instance, the following customisations, which allow Java collections to be used with Python syntax
import jpype
from collections.abc import Iterable, Sequence

@jpype.JImplementationFor("java.lang.Iterable")
class _JIterable:
    def __jclass_init__(self):
        Iterable.register(self)  # makes this class a subtype of Iterable, to speed up isinstance checks
    def __iter__(self):
        return self.iterator()

@jpype.JImplementationFor("java.util.Collection")
class _JCollection:
    def __len__(self):
        return self.size()  # supports "len(coll)" syntax
    def __delitem__(self, i):
        return self.remove(i)  # supports "del coll[i]" syntax
    def __contains__(self, i):
        return self.contains(i)  # supports "i in coll" syntax
    # __iter__ is inherited from _JIterable
    # because in Java: Collection extends Iterable

@jpype.JImplementationFor('java.util.List')
class _JList(object):
    def __jclass_init__(self):
        Sequence.register(self)  # makes this class a subtype of Sequence, to speed up isinstance checks
    def __getitem__(self, ndx):
        return self.get(ndx)  # supports "list[i]" syntax
    def append(self, obj):
        return self.add(obj)  # supports "list.append(obj)" syntax
    # __len__, __delitem__, __contains__, __iter__ are inherited from _JCollection
this is taken directly from JPype’s codebase
The code wrapped via JPype is not Pythonic by default
It is important to make the wrapped code as Pythonic as possible
package names: io.github.gciatto.csv.Csv → jcsv.Csv
naming conventions: snake_case instead of camelCase
magic methods: __len__ for java.util.Collection, __getitem__ for java.util.List
All such refinements can be done in JPype via customisations of the Java types
For all public types in the wrapped Java library:
jcsv package (pt. 1)
The jcsv package is a Pythonic wrapper for our JVM-based io.github.gciatto.csv library
Java’s type definitions are brought to Python in jcsv/__init__.py:
import jpype
import jpype.imports
from java.lang import Iterable as JIterable
_csv = jpype.JPackage("io.github.gciatto.csv")
Table = _csv.Table
Row = _csv.Row
Record = _csv.Record
Header = _csv.Header
Formatter = _csv.Formatter
Parser = _csv.Parser
Configuration = _csv.Configuration
Csv = _csv.Csv
CsvJvm = _csv.CsvJvm
making it possible to write the following code on the user side:
from jcsv import Table, Record, Header
jcsv package (pt. 2)
Parsing and formatting operations are mapped straightforwardly to Python functions:
# jcsv/__init__.py
def parse_csv_string(string, separator=Csv.DEFAULT_SEPARATOR, delimiter=Csv.DEFAULT_DELIMITER, comment=Csv.DEFAULT_COMMENT):
    return Csv.parseAsCSV(string, separator, delimiter, comment)

def parse_csv_file(path, separator=Csv.DEFAULT_SEPARATOR, delimiter=Csv.DEFAULT_DELIMITER, comment=Csv.DEFAULT_COMMENT):
    return CsvJvm.parseCsvFile(str(path), separator, delimiter, comment)

def format_as_csv(rows, separator=Csv.DEFAULT_SEPARATOR, delimiter=Csv.DEFAULT_DELIMITER, comment=Csv.DEFAULT_COMMENT):
    return Csv.formatAsCSV(JIterable@rows, separator, delimiter, comment)
jcsv package (pt. 3)
An ad-hoc factory method is provided for building Header instances:
# jcsv/__init__.py
from jcsv.python import iterable_or_varargs

def header(*args):
    # a single integer argument requests an anonymous header with that many columns
    if len(args) == 1 and isinstance(args[0], int):
        return Csv.anonymousHeader(args[0])
    return iterable_or_varargs(args, lambda xs: Csv.headerOf(JIterable@map(str, xs)))
making it possible to write the following code on the user side:
import jcsv
header1 = jcsv.header("column1", "column2", "column3")
header2 = jcsv.header(3) # anonymous header with 3 columns
columns = (f"column{i}" for i in range(1, 4)) # generator expression
header3 = jcsv.header(columns) # same as header1, but passing an iterable
The iterable_or_varargs function simulates multiple overloads:
# jcsv/python.py
from typing import Iterable

def iterable_or_varargs(args, f):
    assert isinstance(args, Iterable)
    if len(args) == 1:
        item = args[0]
        # a lone string is treated as a single item, not as an iterable of characters
        if isinstance(item, Iterable) and not isinstance(item, str):
            return f(item)
        else:
            return f([item])
    else:
        return f(args)
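The call shapes handled by this function can be checked without a JVM, by passing a plain Python function as f. The sketch below repeats the function so that it runs standalone. The names function is a hypothetical front-end standing in for a factory such as header, and treating a lone string as a single item is an assumption of this sketch.

```python
from collections.abc import Iterable


def iterable_or_varargs(args, f):
    """Call f on items passed either as varargs or as one iterable argument."""
    if len(args) == 1:
        item = args[0]
        # assumption: a lone string is one item, not an iterable of characters
        if isinstance(item, Iterable) and not isinstance(item, str):
            return f(item)
        return f([item])
    return f(args)


def names(*args):
    """Hypothetical varargs front-end, standing in for a factory such as header()."""
    return iterable_or_varargs(args, lambda xs: [str(x) for x in xs])


print(names("a", "b", "c"))      # items passed as varargs
print(names(["a", "b", "c"]))    # items passed as one list
print(names(c for c in "abc"))   # items passed as one generator
print(names("abc"))              # one plain string, kept whole
```

The first three calls print the same three-element list, while the last one prints a list holding the single string.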
jcsv package (pt. 4)
An ad-hoc factory method is provided for building Record instances:
# jcsv/__init__.py
def record(header, *args):
    return iterable_or_varargs(args, lambda xs: Csv.recordOf(header, JIterable@map(str, xs)))
An ad-hoc factory method is provided for building Table instances:
# jcsv/__init__.py
def __ensure_header(h):
    # accept either a ready-made Header or anything header() can build one from
    return h if isinstance(h, Header) else header(h)

def __ensure_record(r, h):
    # accept either a ready-made Record or the values of a record with header h
    return r if isinstance(r, Record) else record(h, r)

def table(header, *args):
    header = __ensure_header(header)
    args = [__ensure_record(row, header) for row in args]
    return iterable_or_varargs(args, lambda xs: Csv.tableOf(header, JIterable@xs))
jcsv package (pt. 5)
The Row class is customised to make it more Pythonic:
# jcsv/__init__.py
# _java is assumed to be bound to the java.lang package elsewhere in this module,
# e.g. via _java = jpype.JPackage("java.lang")
@jpype.JImplementationFor("io.github.gciatto.csv.Row")
class _Row:
    def __len__(self):
        return self.getSize()
    def __getitem__(self, item):
        # negative indexes count from the end, as in Python sequences
        if isinstance(item, int) and item < 0:
            item = len(self) + item
        try:
            return self.get(item)
        except _java.IndexOutOfBoundsException as e:
            raise IndexError(f"index {item} out of range") from e
    @property
    def size(self):
        return len(self)
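The indexing logic above can be exercised without a JVM. The following is a minimal sketch in which FakeJavaRow, FakeIndexOutOfBounds, and PythonicRow are hypothetical stand-ins: the first two mimic the Java side (get, getSize, and the Java exception), while the third applies the same __len__ and __getitem__ logic as the customisation.

```python
class FakeIndexOutOfBounds(Exception):
    """Hypothetical stand-in for java.lang.IndexOutOfBoundsException."""


class FakeJavaRow:
    """Hypothetical stand-in for a Java Row: exposes only get(i) and getSize()."""

    def __init__(self, *values):
        self._values = list(values)

    def getSize(self):
        return len(self._values)

    def get(self, index):
        # like Java, reject negative and too-large indexes
        if not 0 <= index < len(self._values):
            raise FakeIndexOutOfBounds(index)
        return self._values[index]


class PythonicRow(FakeJavaRow):
    """Applies the same customisation logic as _Row to the stand-in."""

    def __len__(self):
        return self.getSize()

    def __getitem__(self, item):
        if isinstance(item, int) and item < 0:
            item = len(self) + item  # -1 becomes size - 1, i.e. the last element
        try:
            return self.get(item)
        except FakeIndexOutOfBounds as e:
            raise IndexError(f"index {item} out of range") from e


row = PythonicRow("a", "b", "c")
print(len(row), row[0], row[-1])
print(list(row))  # iteration falls back on __getitem__ until IndexError is raised
```

Raising IndexError rather than the Java exception is what makes the fallback iteration in the last line terminate correctly.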
len(row) instead of row.getSize()
row[i] instead of row.get(i)
row[-i] instead of row.get(row.getSize() - i)
IndexError is raised instead of IndexOutOfBoundsException
row.size instead of row.getSize()
jcsv package (pt. 6)
The Header shall inherit all customisations for Row, plus the following ones:
@jpype.JImplementationFor("io.github.gciatto.csv.Header")
class _Header:
    @property
    def columns(self):
        return [str(c) for c in self.getColumns()]
    def __contains__(self, item):
        return self.contains(item)
    def index_of(self, column):
        return self.indexOf(column)
header.columns instead of header.getColumns()
column in header instead of header.contains(column)
header.index_of(column) instead of header.indexOf(column)
jcsv package (pt. 7)
The Record shall inherit all customisations for Row, plus the following ones:
@jpype.JImplementationFor("io.github.gciatto.csv.Record")
class _Record:
    @property
    def header(self):
        return self.getHeader()
    @property
    def values(self):
        return [str(v) for v in self.getValues()]
    def __contains__(self, item):
        return self.contains(item)
record.header instead of record.getHeader()
record.values instead of record.getValues()
value in record instead of record.contains(value)
jcsv package (pt. 8)
The Table class is customised too, to make it more Pythonic:
@jpype.JImplementationFor("io.github.gciatto.csv.Table")
class _Table:
    @property
    def header(self):
        return self.getHeader()
    def __len__(self):
        return self.getSize()
    def __getitem__(self, item):
        # negative indexes count from the end, as in Python sequences
        if isinstance(item, int) and item < 0:
            item = len(self) + item
        try:
            return self.get(item)
        except _java.IndexOutOfBoundsException as e:
            raise IndexError(f"index {item} out of range") from e
    @property
    def records(self):
        return self.getRecords()
    @property
    def size(self):
        return len(self)
table.header instead of table.getHeader()
len(table) instead of table.getSize()
table[i] instead of table.get(i)
table[-i] instead of table.get(table.getSize() - i)
record in table instead of table.contains(record)
table.records instead of table.getRecords()
.jars in JPype projects (pt. 1)
csv-python/
├── build.gradle.kts         # this is where the generation of csv.jar is automated
├── jcsv
│   ├── __init__.py
│   ├── jvm
│   │   ├── __init__.py      # this is where JPype is loaded
│   │   └── csv.jar          # this is the Fat-JAR of the JVM-based library
│   └── python.py
├── requirements.txt
└── test
    ├── __init__.py
    ├── test_parsing.py
    └── test_python_api.py
We need to ensure that the JVM-based library is available on the system where jcsv is installed
The build.gradle.kts file automates the generation of the csv.jar file
and its placement into the jcsv/jvm directory
The jcsv/jvm/__init__.py file loads JPype and the csv.jar file
.jars in JPype projects (pt. 2)
Snippet from the build.gradle.kts:
tasks.create<Copy>("createCoreJar") {
    group = "Python"
    val shadowJar by project(":csv-core").tasks.getting(Jar::class)
    dependsOn(shadowJar)
    from(shadowJar.archiveFile) {
        rename(".*?\\.jar", "csv.jar")
    }
    into(projectDir.resolve("jcsv/jvm"))
}
Content of the jcsv/jvm/__init__.py file:
import jpype
from pathlib import Path
# the directory where csv.jar is placed
CLASSPATH = Path(__file__).parent
# the list of all .jar files in CLASSPATH
JARS = [str(j.resolve()) for j in CLASSPATH.glob('*.jar')]
jpype.startJVM(classpath=JARS)
Important line in jcsv/__init__.py:
import jcsv.jvm
this forces the startup of the JVM, with the correct classpath, whenever someone uses the jcsv module
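If csv.jar has not been generated yet, the glob above silently yields an empty classpath, and the failure only shows up later, when a Java class cannot be found. A possible refinement is to fail early. This is a minimal sketch, assuming a hypothetical helper named find_jars; it uses only the standard pathlib module.

```python
from pathlib import Path


def find_jars(directory):
    """Return the sorted absolute paths of all .jar files directly inside directory.

    find_jars is a hypothetical helper; it raises FileNotFoundError when the
    directory contains no .jar file, instead of returning an empty classpath.
    """
    jars = sorted(str(jar.resolve()) for jar in Path(directory).glob("*.jar"))
    if not jars:
        raise FileNotFoundError(f"no .jar files found in {directory}")
    return jars
```

In jcsv/jvm/__init__.py, the result of find_jars(Path(__file__).parent) could then be passed as the classpath argument of jpype.startJVM.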
We need to ensure that some JVM is available on the system where jcsv is installed
Notice that the JVM is available as a Python dependency too:
This means that the JVM can be automatically downloaded and installed via pip:
pip install jdk4py
… or added as a dependency to the requirements.txt file:
JPype1==1.4.1
jdk4py==17.0.7.0
so, one may simply need to configure JPype to use that JVM:
# jcsv/jvm/__init__.py
import sys
import jpype
from jdk4py import JAVA_HOME

def jvm_lib_file_names():
    # file name of the JVM shared library on each platform
    if sys.platform == "win32":
        return {"jvm.dll"}
    elif sys.platform == "darwin":
        return {"libjli.dylib"}
    else:
        return {"libjvm.so"}

def jvmlib():
    # search the JVM shipped with jdk4py for the shared library
    for name in jvm_lib_file_names():
        for path in JAVA_HOME.glob(f"**/{name}"):
            if path.exists():
                return str(path)
    return None

jpype.startJVM(jvmpath=jvmlib())
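The lookup is easier to test when the JVM home and the platform are passed as parameters. The following is a minimal sketch, assuming a hypothetical function named find_jvm_lib and a hypothetical JVM_LIB_NAMES table; the file names are the ones used in the snippet above.

```python
from pathlib import Path

# shared-library file name of the JVM on each platform (libjvm.so elsewhere)
JVM_LIB_NAMES = {"win32": "jvm.dll", "darwin": "libjli.dylib"}


def find_jvm_lib(java_home, platform):
    """Return the path of the JVM shared library under java_home, or None.

    find_jvm_lib is a hypothetical helper; platform is expected to be a
    value such as sys.platform.
    """
    name = JVM_LIB_NAMES.get(platform, "libjvm.so")
    for path in sorted(Path(java_home).glob(f"**/{name}")):
        if path.is_file():
            return str(path)
    return None
```

With jdk4py installed, find_jvm_lib(JAVA_HOME, sys.platform) would play the role of jvmlib().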
Unit tests are essential to ensure the correctness of the Pythonic API
Consider for instance tests in:
test/test_parsing.py
test/test_python_api.py
It is important to test all the customisations and factory methods
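A test for the header factory might look as follows. This is a minimal sketch, not the content of the actual test files: the class and method names are hypothetical, the expected values follow from the factory and customisations shown earlier, and the tests are skipped when jcsv cannot be imported.

```python
import unittest


class TestHeaderFactory(unittest.TestCase):
    """Hypothetical test case for the Pythonic API of Header."""

    def setUp(self):
        try:
            import jcsv  # importing jcsv starts the JVM as a side effect
        except ImportError:
            self.skipTest("jcsv (or JPype) is not installed")
        self.jcsv = jcsv

    def test_varargs_and_iterable_are_equivalent(self):
        from_varargs = self.jcsv.header("column1", "column2", "column3")
        from_iterable = self.jcsv.header(f"column{i}" for i in range(1, 4))
        self.assertEqual(from_varargs.columns, from_iterable.columns)

    def test_pythonic_accessors(self):
        header = self.jcsv.header("column1", "column2", "column3")
        self.assertEqual(len(header), 3)  # __len__ inherited from Row
        self.assertEqual(header.columns, ["column1", "column2", "column3"])
        self.assertIn("column1", header)  # __contains__ customisation
```

Such a file can be run with python -m unittest from the project root.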
giovanni.ciatto@unibo.it
Compiled on: 2024-02-20