
The Static Type Hole Introduced by Array Covariance

Editor: 關於JAVA (About Java)

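In the previous post, an example that fails Java bytecode verification, we saw that the JVM verifies the .class files it loads in order to guarantee type safety. There is, however, one case in Java that neither the compiler nor the JVM's bytecode verifier can detect, and that only surfaces as an error when the program actually runs: the hole in the static type system caused by array covariance.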

As in the previous post, ASM is used to generate the bytecode:

Java code

import java.io.FileOutputStream;
import org.objectweb.asm.ClassWriter;
import org.objectweb.asm.MethodVisitor;
import org.objectweb.asm.Opcodes;

public class TestASM implements Opcodes {
    public static void main(String[] args) throws Exception {
        ClassWriter cw = new ClassWriter(0);
        cw.visit(
            V1_5,               // class format version
            ACC_PUBLIC,         // class modifiers
            "TestVerification", // fully qualified class name
            null,               // generic signature
            "java/lang/Object", // super class fully qualified name
            new String[] { }    // implemented interfaces
        );
        
        MethodVisitor mv = cw.visitMethod(
            ACC_PUBLIC + ACC_STATIC,   // access modifiers
            "main",                    // method name
            "([Ljava/lang/String;)V",  // method descriptor
            null,                      // generic signature
            null                       // exceptions
        );
        mv.visitCode();
        mv.visitInsn(ICONST_1);
        mv.visitTypeInsn(ANEWARRAY, "java/lang/Float");
        mv.visitTypeInsn(CHECKCAST, "[Ljava/lang/Object;");
        mv.visitVarInsn(ASTORE, 0);
        mv.visitVarInsn(ALOAD, 0);
        mv.visitInsn(ICONST_0);
        mv.visitLdcInsn("a string");
        mv.visitInsn(AASTORE);
        mv.visitVarInsn(ALOAD, 0);
        mv.visitInsn(ICONST_0);
        mv.visitInsn(AALOAD);
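        // Note: the real descriptor of Object.toString is "()Ljava/lang/String;".
        // "()V" is kept to match the listing below; the AASTORE above throws
        // before this call site is ever executed.
        mv.visitMethodInsn(INVOKEVIRTUAL, "java/lang/Object", "toString", "()V");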
        mv.visitInsn(RETURN);
        mv.visitMaxs(3, 1);
        mv.visitEnd(); // end method
        cw.visitEnd(); // end class
        
        byte[] clz = cw.toByteArray();
        // try-with-resources closes the stream even if the write fails
        try (FileOutputStream out = new FileOutputStream("TestVerification.class")) {
            out.write(clz);
        }
    }
}

The result is:

Java bytecode

public class TestVerification extends java.lang.Object
  minor version: 0
  major version: 49
  Constant pool:
const #1 = Asciz        TestVerification;
const #2 = class        #1;     //  TestVerification
const #3 = Asciz        java/lang/Object;
const #4 = class        #3;     //  java/lang/Object
const #5 = Asciz        main;
const #6 = Asciz        ([Ljava/lang/String;)V;
const #7 = Asciz        java/lang/Float;
const #8 = class        #7;     //  java/lang/Float
const #9 = Asciz        [Ljava/lang/Object;;
const #10 = class       #9;     //  "[Ljava/lang/Object;"
const #11 = Asciz       a string;
const #12 = String      #11;    //  a string
const #13 = Asciz       toString;
const #14 = Asciz       ()V;
const #15 = NameAndType #13:#14;//  toString:()V
const #16 = Method      #4.#15; //  java/lang/Object.toString:()V
const #17 = Asciz       Code;

{
public static void main(java.lang.String[]);
  Code:
   Stack=3, Locals=1, Args_size=1
   0:   iconst_1
   1:   anewarray       #8; //class java/lang/Float
   4:   checkcast       #10; //class "[Ljava/lang/Object;"
   7:   astore_0
   8:   aload_0
   9:   iconst_0
   10:  ldc     #12; //String a string
   12:  aastore
   13:  aload_0
   14:  iconst_0
   15:  aaload
   16:  invokevirtual   #16; //Method java/lang/Object.toString:()V
   19:  return

}

This time the code can also be expressed directly in Java source, namely:

Java code

public class TestVerification {
    public static void main(String[] args) {
        Object[] array = (Object[]) new Float[1];
        array[0] = "a string"; // the problem is here
        array[0].toString();
    }
}

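It compiles without any problem. The code fully conforms to the Java language specification and also satisfies the type requirements of the JVM's static verification, so verification at load time passes as well.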

But when it runs...

Command prompt

Exception in thread "main" java.lang.ArrayStoreException: java.lang.String
        at TestVerification.main(Unknown Source)

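Clearly a String object cannot be stored into a Float[]. But because Java arrays are covariant, Java's static type system lets us write this, and the exception is only thrown at run time.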
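To make the contrast visible in source form, here is a minimal sketch. The class name `CovarianceDemo` and the helper `storeIntoFloatArray` are made up for illustration and are not part of the original example. It shows the array store failing only at run time, while the analogous assignment between generic lists (left commented out) is rejected by the compiler because generics are invariant.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; all names here are illustrative only.
class CovarianceDemo {
    // Returns a description of the exception thrown by the store, or "none".
    static String storeIntoFloatArray(Object value) {
        Object[] array = new Float[1]; // allowed: Float[] is a subtype of Object[]
        try {
            array[0] = value;          // type-checks statically, checked again at run time
            return "none";
        } catch (ArrayStoreException e) {
            return e.getClass().getSimpleName() + ": " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(storeIntoFloatArray("a string")); // store is rejected at run time
        System.out.println(storeIntoFloatArray(1.5f));       // store succeeds

        List<Float> floats = new ArrayList<>();
        // List<Object> objects = floats; // does not compile: List<Float> is not a List<Object>
        floats.add(1.5f);
    }
}
```

With lists the mistake never gets past the compiler, so no run-time store check is needed for them.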

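.NET unfortunately imitated this feature of Java and also made arrays covariant. As a result the CLI, like the JVM, must perform a dynamic type check when storing a reference into an array (JVM: aastore; CLI: stelem). That is naturally not great for performance, and it also makes the VM more complicated to implement... sigh.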
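What that dynamic check has to decide can be spelled out in plain Java. The sketch below is only a model of the semantics, not how any real VM implements aastore or stelem; the class `ArrayStoreCheck` and its methods `canStore` and `checkedStore` are hypothetical names introduced here.

```java
// Hypothetical helper that models, in plain Java, the decision the run-time
// check on a reference-array store has to make.
class ArrayStoreCheck {
    static boolean canStore(Object[] array, Object value) {
        if (value == null) {
            return true; // null can be stored into any reference array
        }
        // The array object remembers its real component type at run time.
        Class<?> componentType = array.getClass().getComponentType();
        return componentType.isInstance(value);
    }

    static void checkedStore(Object[] array, int index, Object value) {
        if (!canStore(array, value)) {
            throw new ArrayStoreException(value.getClass().getName());
        }
        array[index] = value;
    }

    public static void main(String[] args) {
        Object[] floats = new Float[1];
        System.out.println(canStore(floats, "a string"));        // false
        System.out.println(canStore(floats, 1.0f));              // true
        System.out.println(canStore(new Object[1], "a string")); // true
    }
}
```

The point is that the check depends on the run-time class of the array object, which the static type Object[] does not reveal, so in general it cannot be discharged at compile or load time.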

The second-to-last paragraph on page 289 of the reprint edition of Virtual Machines: Versatile Platforms for Systems and Processes says:

Quote

Hence, if an object is accessed, the field information for the access can also be checked statically (there is an exception for arrays, given in the next paragraph).

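Yet in the paragraph that follows, the book mentions only the dynamic bounds check on array accesses and does not mention the static type hole introduced by covariance. I think the covariance problem deserves a mention here. After all, array length is not part of Java's static types, so checking it can only be left to run time (although a VM can eliminate many array bounds checks and null checks through data-flow analysis). Type covariance, by contrast, is part of the static type system, yet it has a hole, so a check is still needed at run time, which is annoying.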
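The two kinds of run-time check can be put side by side in a small sketch. The class `RuntimeArrayChecks` and its methods are hypothetical names chosen for this illustration. In the first case the static type String[] says nothing about the length, so nothing static could have caught the error. In the second case the static type does talk about element types, and the store still has to be checked dynamically.

```java
// Hypothetical demo class; names are illustrative only.
class RuntimeArrayChecks {
    // Runs the action and reports which runtime exception, if any, it threw.
    static String describe(Runnable action) {
        try {
            action.run();
            return "ok";
        } catch (RuntimeException e) {
            return e.getClass().getSimpleName();
        }
    }

    static String storePastEnd() {
        String[] strings = new String[1]; // the length 1 is not part of the type String[]
        return describe(() -> strings[1] = "x");
    }

    static String storeWrongType() {
        Object[] objects = new String[1]; // element type is in the static type, yet covariance hides it
        return describe(() -> objects[0] = Integer.valueOf(42));
    }

    public static void main(String[] args) {
        System.out.println(storePastEnd());   // ArrayIndexOutOfBoundsException
        System.out.println(storeWrongType()); // ArrayStoreException
    }
}
```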

Here is what Martin Odersky said about the covariance of Java arrays in a recent interview:

Martin Odersky wrote

Bill Venners: You said you found it frustrating at times to have the constraints of needing to be backwards compatible with Java. Can you give some specific examples of things you couldn't do when you were trying to live within those constraints, which you were then able to do when you changed to doing something that's binary but not source compatible?

Martin Odersky: In the generics design, there were a lot of very, very hard constraints. The strongest constraint, the most difficult to cope with, was that it had to be fully backwards compatible with ungenerified Java. The story was the collections library had just shipped with 1.2, and Sun was not prepared to ship a completely new collections library just because generics came about. So instead it had to just work completely transparently.

That's why there were a number of fairly ugly things. You always had to have ungenerified types with generified types, the so called raw types. Also you couldn't change what arrays were doing so you had unchecked warnings. Most importantly you couldn't do a lot of the things you wanted to do with arrays, like generate an array with a type parameter T, an array of something where you didn't know the type. You couldn't do that. Later in Scala we actually found out how to do that, but that was possible only because we could drop in Scala the requirement that arrays are covariant.

Bill Venners: Can you elaborate on the problem with Java's covariant arrays?

Martin Odersky: When Java first shipped, Bill Joy and James Gosling and the other members of the Java team thought that Java should have generics, only they didn't have the time to do a good job designing it in. So because there would be no generics in Java, at least initially, they felt that arrays had to be covariant. That means an array of String is a subtype of array of Object, for example. The reason for that was they wanted to be able to write, say, a “generic” sort method that took an array of Object and a comparator and that would sort this array of Object. And then let you pass an array of String to it. It turns out that this thing is type unsound in general. That's why you can get an array store exception in Java. And it actually also turns out that this very same thing blocks a decent implementation of generics for arrays. That's why arrays in Java generics don't work at all. You can't have an array of list of string, it's impossible. You're forced to do the ugly raw type, just an array of list, forever. So it was sort of like an original sin. They did something very quickly and thought it was a quick hack. But it actually ruined every design decision later on. So in order not to fall into the same trap again, we had to break off and say, now we will not be upwards compatible with Java, there are some things we want to do differently.
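The "generic" sort that Odersky describes can be sketched roughly as follows. This is my own illustration of the idea, not code from the interview or from the JDK; the class `PreGenericsSort` and its methods are hypothetical, and a simple insertion sort stands in for whatever algorithm a real library would use.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical demo class; names are illustrative only.
class PreGenericsSort {
    // Pre-generics style: one method for every reference array type.
    // This only works for String[], Float[], ... because arrays are covariant.
    static void sort(Object[] a, Comparator<Object> c) {
        for (int i = 1; i < a.length; i++) {
            Object key = a[i];
            int j = i - 1;
            while (j >= 0 && c.compare(a[j], key) > 0) {
                a[j + 1] = a[j]; // each of these stores is still type-checked at run time
                j--;
            }
            a[j + 1] = key;
        }
    }

    static String[] sortedCopy(String[] input) {
        String[] copy = input.clone();
        // A String[] is passed where an Object[] is expected.
        sort(copy, (x, y) -> ((String) x).compareTo((String) y));
        return copy;
    }

    public static void main(String[] args) {
        String[] fruit = { "pear", "apple", "fig" };
        System.out.println(Arrays.toString(sortedCopy(fruit))); // [apple, fig, pear]
    }
}
```

The price of that convenience shows up later with generics: an expression such as `new List<String>[1]` is rejected by javac as generic array creation, which is the limitation Odersky refers to.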

P.S. Readers who do not know what covariance is can read the Wikipedia entry.

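P.P.S. Readers who do not know Martin Odersky, take note: whenever you use Java 5 generics, your code bears his mark. He designed the Pizza language and later took part in the design of GJ (Generic Java), which became the foundation of generics in Java 5. Martin also designed Scala; far more people have heard of Scala than of Pizza...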
